Physical Design Flow
STEP 1:
Load the four main inputs:
Netlist [Gate level netlist]
LEF [Technology LEF, Macro LEF, Std cell LEF]
Libs [Logical libraries, i.e. timing libraries]
SDCs [Design constraints]
Technology LEF contains:
Number of metal layers,
Direction of each metal layer (H/V),
Resistance and capacitance per unit square,
Width, spacing and pitch of all metal layers,
Area,
Thickness,
Via information [double cut and single cut],
A subset of DRC rules.
Macro LEF: All macro information, e.g. dimensions, coordinates and macro pin information.
Std cell LEF: Physical dimensions of the std cells and the geometry of their pins, e.g. input pins A and B and the output pin.
Libs: NLDMs [non-linear delay models]
All timing information of the standard cells, e.g. NAND, AND, OR, flip-flop.
Cell delay as a function of input transition and output load.
Timing sense, i.e. positive or negative unate.
Setup and hold constraints of the sequential cells: setup and hold times as a function of data transition and clock transition.
Recovery and removal checks.
Power information: internal and external power depending on input transition and output capacitance. Cell leakage information.
SDCs: Design constraints like create_clock, set_clock_latency, set_clock_uncertainty, set_multicycle_path, set_input_delay, set_output_delay, set_max_transition, set_max_capacitance.
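The NLDM lookup described under Libs (cell delay for a given input transition and output load) can be sketched as a bilinear interpolation in a two-dimensional table. This is only an illustration: the function name `nldm_lookup`, the axis values and the delay numbers are made up for the example and do not come from any real library.

```python
from bisect import bisect_right

def nldm_lookup(slews, loads, table, slew, load):
    """Bilinear interpolation in an NLDM-style table.

    table[i][j] is the delay at input transition slews[i] and output load
    loads[j]. Points outside the table are extrapolated linearly from the
    nearest pair of grid points.
    """
    def lower_index(axis, x):
        # Index of the grid point at or below x, clamped so i + 1 is valid.
        i = bisect_right(axis, x) - 1
        return max(0, min(i, len(axis) - 2))

    i = lower_index(slews, slew)
    j = lower_index(loads, load)
    ts = (slew - slews[i]) / (slews[i + 1] - slews[i])
    tl = (load - loads[j]) / (loads[j + 1] - loads[j])
    return (table[i][j] * (1 - ts) * (1 - tl)
            + table[i][j + 1] * (1 - ts) * tl
            + table[i + 1][j] * ts * (1 - tl)
            + table[i + 1][j + 1] * ts * tl)

# Hypothetical 2x2 table: transitions in ns, loads in pF, delays in ns.
SLEWS = [0.1, 0.5]
LOADS = [0.01, 0.05]
DELAYS = [[0.10, 0.20],
          [0.15, 0.30]]

mid_delay = nldm_lookup(SLEWS, LOADS, DELAYS, 0.3, 0.03)
```

Real library tables are larger (for example 7x7), but the lookup works the same way on the four grid points surrounding the query.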
STEP 2:
Performing sanity checks
Sanity checks include:
 1) check_timing -verbose [-all] to report all warnings and to find errors, such as flops that are not getting a clock or combinational loops in the design.
 2)checkDesign to check out design like how many 
3) checkNetlist
4) report_constraints -all_violators to check the total number of failing paths. If many paths are failing, taking the netlist into physical design and optimizing it there will increase utilization, so the violations should be cleaned up at the synthesis stage itself.
5) report_analysis_coverage
6) checkPinAssignment
How much negative slack can you take into the design?
  It depends on the WLMs. If the synthesis team has used best-case RC trees in the WLMs, then it is better to have positive or non-negative slack, because the best-case RC WLM assumes there is no wire resistance in the path and sees only pin capacitance, which can be over-optimistic in reality. If synthesis is done with worst-case RC trees, then a design with negative slack of about -30 to -50 ps can be taken in and optimized, because the worst-case tree is a more pessimistic approach.
STEP 3:
Floor planning: talk about how giving a large area to the std cells and placing the memories at the boundaries can cause IR drop.
Congestion issues and how you solved them.
STEP 4:
Power planning:
The main goal of power planning is to meet the given IR drop limit, e.g. 2%, 1% or 0.9% of the supply voltage.
Initially we design the power structure to meet half of the target, i.e. if the target is 2% we build the power structure to meet 1%, which is more pessimistic. IR drop grows as optimization adds more cells, so with this margin the 2% target is still met once routing is done.
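The budget arithmetic above can be written out as a small sketch. The 1.0 V supply is an assumption for illustration only, and `ir_drop_budget_mv` is a hypothetical helper, not a tool command.

```python
def ir_drop_budget_mv(vdd_volts, percent):
    """Allowed IR drop in millivolts for a limit given as a percentage of VDD."""
    return vdd_volts * 1000.0 * percent / 100.0

VDD = 1.0  # assumed supply voltage in volts

signoff_budget = ir_drop_budget_mv(VDD, 2.0)  # final 2% target
initial_budget = signoff_budget / 2           # build the grid for half of it

print(f"signoff budget: {signoff_budget:.1f} mV, initial target: {initial_budget:.1f} mV")
```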
Three types of power: static, leakage and dynamic.
Leakage power is the power drawn when the cells are powered but not switching. In thinner technologies it comes from sub-threshold leakage from drain to source and from gate oxide leakage due to gate oxide tunneling.
Dynamic power is the power dissipated when the cells are switching.
There are 2 types of dynamic power: internal power and external switching power.
Internal power is dissipated in charging internal capacitance and by the short-circuit current [crowbar current].
External switching power is the power dissipated in charging and discharging output loads.
Different areas of the chip function at different frequencies; one area may be slow and another fast. If we design the power structure for dynamic power, then after CTS more buffers will be added and the switching may vary at each stage of optimization, so we would need to create a different power structure at each stage, which is not the right approach.
So we take the average of all switching factors [static power is the average switching power], build a power structure, and then fine-tune it if there is IR drop.
STEP 5:
Placement:
Timing optimization: Uncertainty includes skew + jitter + delay margin due to OCV derates.
Why do we need to give uncertainty?
We need to model the post-CTS effects at the pre-CTS stage itself, because once the clock tree is built, post-CTS optimization will not touch the flops and clock cells; we can only work on the data-path cells and use useful skew to fix timing. If we add pessimism to model post-CTS effects at pre-CTS, the tool optimizes and refines the placement with those later effects already accounted for.
Why solve only setup at PreCts?
At the pre-CTS stage the clock is ideal, not propagated, so the launch and capture flops see the clock edge at the same time. Looking at the internal circuit of the flop, the clock-to-Q delay plus the propagation delay cannot be less than the hold time [Buffer delay].
Hence there should be no hold violations at the pre-CTS stage; the ones we still see come from the uncertainty values we have given.
If we fix hold at this stage, the tool will insert a lot of unnecessary buffers, because we do not know whether the uncertainties we have given are correct. That increases area, utilization, leakage and congestion.
Once the clock is propagated (after placement and CTS), hold can be fixed with reduced uncertainties.
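The hold argument above can be checked with a small calculation. This is a simplified sketch: the delay values are invented, all times are in picoseconds, and `hold_slack_ps` is a hypothetical helper rather than a tool report.

```python
def hold_slack_ps(clk_to_q, data_delay, hold_time, uncertainty,
                  launch_latency=0, capture_latency=0):
    """Hold slack = earliest data arrival - (capture edge + hold + uncertainty).

    With an ideal clock both latencies are 0, so launch and capture see the
    same edge.
    """
    arrival = launch_latency + clk_to_q + data_delay
    required = capture_latency + hold_time + uncertainty
    return arrival - required

# Ideal clock, no uncertainty: clock-to-Q plus data delay exceeds the hold time.
no_uncertainty = hold_slack_ps(clk_to_q=80, data_delay=50, hold_time=30, uncertainty=0)

# Same path with a large pre-CTS uncertainty: a hold violation appears.
with_uncertainty = hold_slack_ps(clk_to_q=80, data_delay=50, hold_time=30, uncertainty=150)

print(no_uncertainty, with_uncertainty)
```

The second violation is produced entirely by the uncertainty value, which is why fixing it before the clock is propagated tends to add buffers that are not needed later.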
STEP 6: 
Clock Tree Synthesis:
Once setup is clean we can move on to build a clock tree.
Input to CTS: 
All required libraries & tech rule files imported
SDC
Floorplanning & Power planning done
Standard cells are placed and optimization done
Congestion analyzed and within limits
Placement database
Clock specification File: 
Clock name
Clock period
Max latency
Targeted skew
Max trans, max cap
Buffers and inverters
NDRs
Macro model
Build the clock tree with non-leaf nets on metal layers 6 and 7 and leaf nets on layers 3, 4 and 5.
Apply NDRs only to the non-leaf nets, because using NDRs on the leaf nets as well can cause routing congestion.
Which do you prefer, buffers or inverters, to build a clock tree?
 It is better to use both buffers and inverters, but the number of inverters on each path should be even so that the clock polarity is preserved. A buffer has more delay than an inverter, but if you use only inverters and there are max transition violations, you will need buffers to fix them anyway.
How do you say your CTS is good?
CTS is good when:
1) Latency is low.
2) Skew is minimal.
3) Fewer buffers are used.
4) There are fewer levels.
What happens if latency is more?
# The clock takes longer to reach the flops. Even though the skew is balanced, it takes longer, so the chip will be slow. We need faster chips, so target minimum latency.
# The clock travels through different areas of the chip. Different areas have different OCVs, which can affect the clock, and delays may vary across the chip.
# Once we apply derates after CTS, a larger network latency means a larger derated delay, so we may see more timing violations.
# Suppose the latency is 2 ns: after derates it is 2 x 1.02 = 2.04. Likewise 4 x 1.02 = 4.08 and 8 x 1.02 = 8.16, so the extra delay added by the derate factor grows with latency and the arrival time is pushed out.
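The derate arithmetic in the last point can be reproduced with a few lines. The 1.02 factor and the 2, 4 and 8 ns latencies are the ones used above; `derated_latency` is just an illustrative helper name.

```python
def derated_latency(latency_ns, derate=1.02):
    """Return (derated latency, extra delay added by the derate), both in ns."""
    derated = latency_ns * derate
    return derated, derated - latency_ns

for latency in (2.0, 4.0, 8.0):
    derated, extra = derated_latency(latency)
    print(f"{latency:.2f} ns -> {derated:.2f} ns (+{extra:.2f} ns)")
```

The added delay doubles each time the latency doubles, which is the reason a long clock tree is hit harder by the same derate.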
STEP 7:
POST CTS OPTIMIZATION:
Apply derates and enable CPPR (clock path pessimism removal). Without CPPR, the derates applied to the common part of the clock path make the analysis over-pessimistic. Optimize for both setup and hold.
1. Talk about derates, OCVs, CPPR.
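A simplified setup check shows why CPPR matters once derates are applied. This is a sketch under stated assumptions: the late and early derates (1.05 and 0.95), all delays (in ps) and the helper name `setup_slack_ps` are invented for illustration, and the data path is left underated to keep the arithmetic short.

```python
def setup_slack_ps(period, common, launch_branch, capture_branch,
                   data_delay, setup_time, late=1.05, early=0.95, cppr=True):
    """Setup slack with OCV derates on the clock paths.

    The launch clock is derated late and the capture clock early. The common
    clock segment is shared by both paths, so the difference the derates
    create on it is pessimism that CPPR credits back.
    """
    launch_arrival = (common + launch_branch) * late
    capture_arrival = (common + capture_branch) * early
    slack = (period + capture_arrival - setup_time) - (launch_arrival + data_delay)
    if cppr:
        slack += common * (late - early)
    return slack

path = dict(period=2000, common=1000, launch_branch=200, capture_branch=200,
            data_delay=1900, setup_time=50)

without_cppr = setup_slack_ps(cppr=False, **path)
with_cppr = setup_slack_ps(cppr=True, **path)
print(f"without CPPR: {without_cppr:.1f} ps, with CPPR: {with_cppr:.1f} ps")
```

With these numbers the path fails by about 70 ps without the credit and passes by about 30 ps with it, even though nothing in the design changed.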
STEP 8:
ROUTING:
Route the Design. 
After routing you will see that the setup violations have increased. That is because until this point the routing was trial route and the delays were estimated from it; once actual routing is done we get the actual RCs.
There will be few or no hold violations, because the larger delays add to the hold margin.
Signoff Flow
1) Routing
2) Extract SPEF, cmd<extractRC>, with captable. Coupling capacitance will be included with the ground cap.
3) Setup/hold optimization.
4) Extract RC with coupling capacitance, cmd<extractRC -coupling true>
5) Setup and hold optimization with SI, cmd<optDesign -postRoute -si>
6) Dump [Netlist, LEF, DEF] with captable. Give it to StartXtract and it will dump a SPEF with which STA signoff is done.
USEFUL SKEW
Case 1
To solve setup time of FF2.
You need to check the hold margin of FF2 then setup margin of FF3.
[Figure: flops FF1, FF2 and FF3]
Case 2
To solve hold time of FF2.
You need to check setup margin of FF2 and hold margin of FF0.
[Figure: flops FF0, FF1 and FF2]
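Case 1 can be put into numbers. The sketch below assumes a chain FF1 -> FF2 -> FF3 and delays the clock of FF2 by some amount: the setup slack at FF2 improves by that amount, while the hold slack at FF2 and the setup slack at FF3 both shrink by it. The slack values (in ps) and the function names are hypothetical.

```python
def apply_useful_skew(setup_ff2, hold_ff2, setup_ff3, shift):
    """Slacks after delaying FF2's clock by `shift` ps.

    Returns (setup slack at FF2, hold slack at FF2, setup slack at FF3).
    """
    return setup_ff2 + shift, hold_ff2 - shift, setup_ff3 - shift

def max_safe_shift(hold_ff2, setup_ff3):
    """Largest clock delay at FF2 that keeps both borrowed margins non-negative."""
    return max(0, min(hold_ff2, setup_ff3))

# FF2 fails setup by 40 ps; FF2 has 60 ps of hold margin, FF3 has 50 ps of setup margin.
limit = max_safe_shift(hold_ff2=60, setup_ff3=50)
after = apply_useful_skew(setup_ff2=-40, hold_ff2=60, setup_ff3=50, shift=40)
print(limit, after)
```

Here the 40 ps needed is within the 50 ps limit, so the violation can be closed by skewing FF2 alone. If the needed shift exceeded the limit, useful skew would only move the violation to the hold check at FF2 or the setup check at FF3.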