Is System Level Test Practical at the Wafer Level on ATE for 3D-IC Processors?
Gregory Smith, Teradyne
System Level Test
• Typically the last test insertion before shipment
• Main purpose of the test is to "boot" the processor and run self-testing diagnostics
• Sometimes system level test is a permanent part of the test flow
• Often SLT is used on devices until functional test coverage is high enough to eliminate the insertion
• Current commercial solutions support 6 sites in parallel
• Test times can be minutes long
• Yields are usually very high (>>98%)
Photo: Chroma ATE
System Level Test Details
• Tester applies power to device
• Device loads bootstrap loader from flash memory into DRAM
• Device executes loader. Loader loads OS kernel image from flash into DRAM, and then launches the OS
• OS automatically starts diagnostic routines as a startup program
• Diagnostic routines test all cores, graphics, interfaces and power modes
• Test results are written to diagnostic port (UART)
• Tester looks for "Test Passed" message to bin the part
• Loader size: ~0.5 to 2MB
• OS kernel size: ~5 to 20MB
Diagram: Device Under Test connected to DRAM, Debug/Test Controller (PC), Audio Port (usually looped back), Display (usually to an HDMI receiver), Power Management partner devices, and Flash Memory (with bootstrap loader and kernel image)
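The final binning step above can be sketched in a few lines. This is a minimal illustration, not the production tester code: the function name `bin_part`, the 300 second timeout and the sample log lines are all hypothetical, and the UART is stood in for by any iterable of text lines. Only the "Test Passed" marker comes from the slide.

```python
import time
from typing import Iterable

PASS_MARKER = "Test Passed"  # message the slide says the tester looks for


def bin_part(uart_lines: Iterable[str], timeout_s: float = 300.0) -> str:
    """Return "PASS" if the marker appears before the timeout, else "FAIL".

    uart_lines is any iterable of text lines (hypothetical stand-in for the
    diagnostic port). The timeout is only checked when a line arrives; a real
    serial read would also need its own blocking-read timeout.
    """
    deadline = time.monotonic() + timeout_s
    for line in uart_lines:
        if time.monotonic() > deadline:
            return "FAIL"  # boot or diagnostics took too long
        if PASS_MARKER in line:
            return "PASS"
    return "FAIL"  # stream ended without the marker


# Hypothetical log lines, for illustration only
sample_log = ["loader: start", "kernel: boot ok", "diag: cores ok", "diag: Test Passed"]
print(bin_part(sample_log))
```

A part whose boot hangs never emits the marker, so it falls through to the fail bin without any knowledge of where the boot stopped.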
Mobile Processor Production Process Changes
Process flow diagrams (three variants):
• Mobile Processor w/ ext Memory: FAB → Wafer Test → Assemble → Pkg Test → Burn-in → Post BI Test → System Level Test. Scrap at Wafer Test: bad die. Scrap after assembly: die + pkg.
• Mobile Processor w/ Package On Package Memory: FAB → Wafer Test → Assemble MAP die → Add POP Memory → Pkg Test → Burn-in → Post BI Test → System Level Test. Scrap at Wafer Test: bad die. Scrap after assembly: die, pkg, mem.
• Mobile Processor w/ Wide IO Memory: FAB → Wafer Test → Add Memory Cube → Assemble → Pkg Test → Burn-in → Post BI Test → System Level Test. Scrap at Wafer Test: bad die. Scrap after assembly: die, pkg, mem.
What if … SLT at Wafer Level Test
Diagram (two panels, image credit: Qualcomm):
• SLT at Package Test: Passing Die + Memory Dice → 3D IC Assembly → Initial Package Test → Burn-in → Post BI Test → SLT, with the DUT connected to Host, Audio, Flash and PMIC
• SLT Coverage at Wafer Test: Protocol Aware ATE provides Host Emulation, DRAM Emulation, Flash Emulation, Audio Tests, Display Interface and Power Mode Tests on the DUT, plus Wafer Level Burn-in. Bad devices are identified at wafer level. Passing Die + Memory Dice → 3D IC Assembly → Final Package Test. Result: low DPM, with yield loss only due to assembly defects
SLT on ATE – Challenges / Strategies
• Technical
  – Challenge: Need to emulate device interfaces in real time (including memory)
  – Strategy: Protocol Aware capable ATE solutions for many interfaces. WideIO interfaces unsolved
• Interfacing
  – Challenge: Need to support full at-speed performance on a probe card
  – Strategy: Leverage probe technology in use for RF ICs, GPUs and MPUs
• Commercial
  – Challenge: SLT is cheap! A couple of thousand bucks for a motherboard, plus handler and power supplies. How can an expensive ATE compete?
  – Strategy: ROI needs to account for:
    • Reduced scrap costs
    • Faster failure analysis
    • Improved time to market
3D-IC Could Radically Increase Scrap Cost

                        Single die package          Processor with POP memory
                        Conv. SLT     WL-SLT        Conv. SLT     WL-SLT
Die cost                $2.000        $2.000        $2.000        $2.000
WS test cost            $0.100        $0.300        $0.100        $0.300
WS yield                90%           88%           90%           88%
WS scrap cost           $0.210        $0.276        $0.210        $0.276
Memory cube cost        -             -             $10.000       $10.000
Package cost            $0.500        $0.500        $1.000        $1.000
Assembly cost           $0.100        $0.100        $0.500        $0.500
FT test cost            $0.100        $0.100        $0.100        $0.100
FT yield                98%           99.5%         98%           99.5%
FT scrap cost           $0.060        $0.016        $0.278        $0.071
SLT test cost           $0.100        -             $0.100        -
SLT yield               99.5%         -             99.5%         -
SLT scrap cost          $0.016        -             $0.070        -
Total COGS              $3.186        $3.292        $14.358       $14.247
Total COT               $0.300        $0.400        $0.300        $0.400
Total scrap cost        $0.286        $0.292        $0.558        $0.347
Total COT + scrap       $0.586        $0.692        $0.858        $0.747
Change from baseline    (baseline)    -18.2%        (baseline)    13.0%
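The scrap figures in this table can be approximately reproduced with a simple cost roll-up. The slide does not state its formula, so the model below is an assumption: at each test insertion, every dollar accumulated so far (including earlier scrap) is lost on the failing fraction. The names `Insertion` and `cost_rollup` are hypothetical; the input numbers are the POP memory columns of the table.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Insertion:
    name: str
    added_cost: float  # material and assembly cost added before this test ($)
    test_cost: float   # cost of the test insertion itself ($)
    pass_yield: float  # fraction of parts passing this insertion


def cost_rollup(die_cost: float, insertions: List[Insertion]) -> Dict[str, float]:
    """Assumed model: scrap at each insertion = accumulated cost x fail rate."""
    accumulated = die_cost
    total_test = 0.0
    total_scrap = 0.0
    for step in insertions:
        accumulated += step.added_cost + step.test_cost
        scrap = accumulated * (1.0 - step.pass_yield)
        accumulated += scrap  # scrap is carried by the surviving parts
        total_test += step.test_cost
        total_scrap += scrap
    return {"cogs": accumulated, "cot": total_test, "scrap": total_scrap,
            "cot_plus_scrap": total_test + total_scrap}


# POP memory case: memory cube $10.000 + package $1.000 + assembly $0.500
pop_conventional = cost_rollup(2.0, [
    Insertion("WS", 0.0, 0.1, 0.90),
    Insertion("FT", 11.5, 0.1, 0.98),
    Insertion("SLT", 0.0, 0.1, 0.995),
])
pop_wafer_level = cost_rollup(2.0, [
    Insertion("WS + SLT coverage", 0.0, 0.3, 0.88),
    Insertion("FT", 11.5, 0.1, 0.995),
])

saving = 1.0 - pop_wafer_level["cot_plus_scrap"] / pop_conventional["cot_plus_scrap"]
print(f"conventional COGS: {pop_conventional['cogs']:.3f}")
print(f"wafer-level COGS:  {pop_wafer_level['cogs']:.3f}")
print(f"COT + scrap saving: {saving:.1%}")
```

Under this assumed model the totals land within a fraction of a cent of the table, and the saving comes out close to the slide's 13.0%. The key effect is visible in the inputs: catching failures before the $10 memory cube is attached moves most of the scrap to the point where only the die has been paid for.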
SLT on ATE Prevents Idle Test Cells
Charts (two panels, each plotting Weekly Volume (K) and # of Test Cells against Weeks from Production Release, 0 to 60):
• "ATE and SLT Capacity": Device Volume, ATE Capacity and SLT Capacity as separate curves, with idle SLT setups marked
• "ATE with SLT Capacity": Device Volume and a single combined ATE with SLT Capacity curve
• Eliminates excess SLT capacity as SLT test time is reduced
• Increases velocity of improvements to structural tests from SLT with better FA
• Reduces device bring-up time for initial sample test
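A back-of-the-envelope capacity calculation shows why dedicated SLT setups go idle as SLT test time shrinks. This is only a sketch: the 120 s and 30 s test times and the 300K weekly volume are hypothetical inputs chosen for illustration, and only the 6 sites per setup comes from the earlier slide. The function name `cells_needed` is likewise an assumption.

```python
import math

SECONDS_PER_WEEK = 7 * 24 * 3600


def cells_needed(weekly_volume: int, test_time_s: float, sites: int,
                 utilization: float = 1.0) -> int:
    """Test cells required to absorb a weekly volume at a given test time."""
    device_seconds = weekly_volume * test_time_s
    cell_seconds = sites * SECONDS_PER_WEEK * utilization
    return math.ceil(device_seconds / cell_seconds)


# Hypothetical ramp: SLT test time drops from 120 s to 30 s as coverage
# migrates into structural and functional tests.
at_release = cells_needed(300_000, 120, sites=6)
after_reduction = cells_needed(300_000, 30, sites=6)
print(at_release, after_reduction, at_release - after_reduction)
```

With dedicated SLT hardware, the difference between the two counts is equipment that sits idle. If the SLT content runs on the ATE, that capacity is simply reused for other test time.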
SLT on ATE is Enabled by Protocol Aware ATE
Diagram: digital card block diagram. Host Computer feeds both a "stored response" digital path (Logic Patgen, DSSC, Timing) and FPGA Based Protocol Engines with Transaction Memory. Both paths drive the Pin Electronics, which connect to the DUT.
• PA architecture integrated into digital instrument
  – Select Protocol Aware or Standard Digital on any pin
  – Used together with Scan, BIST, etc.
• "Real Time Intelligence" to communicate with DUT
• FPGA architecture allows flexibility and low latency
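The difference between "stored response" digital and a protocol engine can be illustrated with a toy model. Everything here is hypothetical and heavily simplified (there is no real timing, pin electronics or FPGA involved): a stored-response check demands a fixed sequence, while the protocol engine answers read transactions from its transaction memory in whatever order the DUT issues them.

```python
from typing import Dict, Iterable, List, Tuple


def stored_response_check(observed: List[str], expected: List[str]) -> bool:
    """Stored-response style: DUT activity must match a fixed sequence exactly."""
    return observed == expected


class ProtocolEngine:
    """Toy protocol-aware responder (hypothetical, not a real ATE API)."""

    def __init__(self, transaction_memory: Dict[int, int]) -> None:
        self.memory = transaction_memory

    def respond(self, requests: Iterable[Tuple[str, int]]) -> List[int]:
        """Answer each READ with the data stored at the requested address."""
        replies = []
        for operation, address in requests:
            if operation == "READ":
                replies.append(self.memory[address])
        return replies


engine = ProtocolEngine({0x0: 0xAA, 0x4: 0xBB})

# A booting DUT may issue the same reads in a different order from run to run.
print(engine.respond([("READ", 0x0), ("READ", 0x4)]))
print(engine.respond([("READ", 0x4), ("READ", 0x0)]))
print(stored_response_check(["READ 4", "READ 0"], ["READ 0", "READ 4"]))
```

The fixed-sequence comparison rejects the reordered run even though the device behaved correctly, which is why booting an OS against emulated interfaces needs the transaction-level approach.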
ATE Memory Emulation State of the Art - 2012
Each device interface requirement is listed with the current SLT implementation, current ATE capability and potential ATE capability.
• Flash Memory
  – Emulate eMMC protocol. Current SLT: yes. Current ATE: supported protocol. Potential ATE: supported protocol
  – Image supported. Current SLT: ~20MB. Current ATE: >20MW. Potential ATE: >20MW
• LP-DDR
  – Provide LP-DDR3 I/F. Current SLT: uses DRAM device. Current ATE: DDR emulation. Potential ATE: DDR emulation
  – Interface speeds. Current SLT: 400 to 1600Mbps now, up to 4.1 by 2014. Current ATE: to 1067Mbps. Potential ATE: faster, but emulation at 4.1 is tough
  – Read latency. Current SLT: <10 cycles. Current ATE: >>10 cycles. Potential ATE: >>10 cycles
  – Memory size. Current SLT: ~20MB (kernel). Current ATE: 64KW. Potential ATE: double?
• Wide IO Memory (current SLT: perform test after memory assembly)
  – Provide Wide IO pin count. Current ATE: too expensive, need ~$100/pin. Potential ATE: solvable, sacrifice features
  – Provide very low load C (<1 pF). Current ATE: loads >20 pF. Potential ATE: custom buffer on probe card?
  – Reliably contact >500 microbumps. Current ATE: microbumps too close. Potential ATE: continued R&D in probe cards
  – Emulate Wide IO protocol. Current ATE: PA limited to 1/2 board (128 pins). Potential ATE: solvable
DDR Memory Emulation Challenge
• Physical distance from DUT to memory, plus buffering, adds latency
• Use of internal FPGA memory (max 10Mb) limits emulation size
• Possible to use external memory, but with a large increase in latency for the additional DDR controller in the FPGA
• Crossing time domain from device DDR timing to internal digital instrument timing requires retiming with PLLs. This makes rate changes tough to follow
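The emulation size limit can be put next to the loader and kernel sizes quoted earlier (loader ~0.5 to 2MB, OS kernel ~5 to 20MB). The sketch below assumes "10Mb" means 10 megabits and uses decimal units; both are assumptions, as are the function names. It also ignores any memory the FPGA needs for other purposes.

```python
FPGA_MEMORY_BITS = 10_000_000  # "max 10Mb", read here as 10 megabits (assumption)
BYTES_PER_MB = 1_000_000       # decimal megabytes (assumption)


def fpga_capacity_mb() -> float:
    """Internal FPGA memory expressed in megabytes."""
    return FPGA_MEMORY_BITS / 8 / BYTES_PER_MB


def reduction_needed(image_mb: float) -> float:
    """Factor by which an image must shrink to fit; <= 1 means it already fits."""
    return image_mb / fpga_capacity_mb()


images = [("loader, small", 0.5), ("loader, large", 2.0),
          ("kernel, small", 5.0), ("kernel, large", 20.0)]
for name, size_mb in images:
    factor = reduction_needed(size_mb)
    verdict = "fits" if factor <= 1.0 else f"needs {factor:.1f}x reduction"
    print(f"{name}: {size_mb} MB -> {verdict}")
```

Under these assumptions the internal memory holds about 1.25 MB, so only the smallest loaders fit as-is and a kernel would have to shrink by roughly 4x to 16x.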
Is it practical to replace SLT with tests at an ATE insertion?
• For LP-DDR POP and PIP designs, yes…
  – Memory controller testability may require enhancement
    • Tolerate longer read latency, constant DDR rates
  – OS kernel requires a radical size reduction to utilize emulated DRAM
    • Current loader and OS kernel include large numbers of unused services
    • A test-specific "Tiny Loader" and "Tiny Kernel" could be developed
    • Bonus: a "Tiny Kernel" will load and execute more quickly
  – Test coverage would equal SLT, but would not use the same OS as the target application
• For WideIO memory designs
  – ATE per-pin digital prices need to come down by 5x
  – A solution to low capacitance drive must be developed
  – A solution to probe high density microbumps must be developed
  – Until that happens, emulating WideIO memory for System Level Testing on wafer is not yet practical.
Conclusion
Diagram (repeated from the "What if … SLT at Wafer Level Test" slide, image credit: Qualcomm): Protocol Aware ATE provides Host Emulation, DRAM Emulation, Flash Emulation, Audio Tests, Display Interface and Power Mode Tests on the DUT, plus Wafer Level Burn-in. Bad devices are identified at wafer level. Passing Die + Memory Dice → 3D IC Assembly → Final Package Test. Result: low DPM, with yield loss only due to assembly defects
• Increasing test coverage at Wafer Test, including SLT coverage, is possible for LP-DDR designs, and has been validated to reduce scrap, improve final yield and accelerate time to volume
• Focused efforts from ATE suppliers and device manufacturers are needed to
  – Improve ATE capability
  – Improve device testability
  – Develop "Tiny Loader" and "Tiny Kernel"
• Providing WideIO memory emulation on ATE will require significant R&D to solve electrical and mechanical challenges before it can be used in production.
• Yield loss after assembly of 3D ICs could limit adoption in more cost sensitive markets