The V&V workflow begins with a formal scenario description: functional narratives are encoded in a human-readable DSL (e.g., M-SDL/Scenic), then reduced to logical parameter ranges and finally to concrete instantiations selected by design of experiments (DOE). This ensures tests are reproducible, shareable, and traceable from high-level goals down to the numeric seeds that define a specific run. The simulator co-executes these scenarios with the algorithms under test inside the digital twin, and the V&V interface collects vehicle control signals, virtual sensor streams, and per-run metrics to generate the verdicts required by the safety case.
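
As an illustration of this reduction, the sketch below (plain Python; the parameter names, ranges, and full-factorial design are assumptions for illustration, not the project's actual catalogue) turns a logical scenario into concrete, seeded instantiations:

<code python>
import itertools
import random

# Logical scenario: parameter ranges for a cut-in manoeuvre (illustrative values).
logical_scenario = {
    "ego_speed_mps":  [8.0, 10.0, 12.0],   # ego target speed
    "lead_speed_mps": [4.0, 6.0, 8.0],     # lead-vehicle speed
    "initial_gap_m":  [15.0, 25.0, 40.0],  # initial separation
    "friction_coeff": [0.6, 0.9],          # road surface condition
}

SEED = 42  # recorded with every run so a concrete case can be re-executed exactly

def concrete_scenarios(logical, seed=SEED):
    """Full-factorial DOE: one concrete case per combination of parameter levels."""
    rng = random.Random(seed)
    keys = sorted(logical)
    for run_id, values in enumerate(itertools.product(*(logical[k] for k in keys))):
        case = dict(zip(keys, values))
        case["run_id"] = run_id
        case["seed"] = rng.randrange(2**32)  # per-run seed for stochastic actors
        yield case

for case in concrete_scenarios(logical_scenario):
    print(case)  # each dict is one concrete, reproducible, traceable test case
</code>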
To maintain broad coverage without sacrificing realism, validation can follow the two-layer approach shown in Figure 1. A low-fidelity (LF) layer (e.g., SUMO) sweeps wide parameter grids quickly to reveal where planning/control begins to stress safety constraints; a high-fidelity (HF) layer (e.g., a game-engine simulator such as CARLA with the control software in the loop) then replays the most informative cases with photorealistic sensors and closed-loop actuation. Both layers log the same KPIs, so results are comparable and can be promoted to track tests when warranted. This division of labor is central to scaling the scenario space while maintaining end-to-end realism for planning and control behaviors such as cut-in/out, overtaking, and lane changes.
<figure Low and High Fidelity>
{{ :en:safeav:ctrl:low-high_fildelity_simulator.png?400 |Low and High Fidelity Simulators}}
<caption>Fidelity of AV simulation: a) Low-Fidelity SUMO simulator((Pablo Alvarez Lopez, Michael Behrisch, Laura Bieker-Walz, Jakob Erdmann, Yun-Pang Flötteröd, Robert Hilbrich, Leonhard Lücken, Johannes Rummel, Peter Wagner, and Evamarie Wießner. Microscopic traffic simulation using SUMO. In The 21st IEEE International Conference on Intelligent Transportation Systems. IEEE, 2018.)) b) High-Fidelity AWSIM simulator((Autoware Foundation. TIER IV AWSIM. https://github.com/tier4/AWSIM, 2022.))</caption>
</figure>
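
A minimal sketch of how informative LF cases might be promoted to the HF layer, assuming hypothetical per-run KPI fields and illustrative thresholds:

<code python>
from dataclasses import dataclass

@dataclass
class RunKPIs:
    run_id: int
    min_ttc_s: float       # minimum Time-to-Collision observed in the run
    min_gap_m: float       # minimum separation to other actors
    max_decel_mps2: float  # strongest braking applied
    collided: bool

def promote_to_hf(lf_results, ttc_margin=2.0, gap_margin=5.0, decel_limit=4.0):
    """Forward LF runs that collided or approached the safety margins to the HF layer."""
    return [
        r for r in lf_results
        if r.collided
        or r.min_ttc_s < ttc_margin
        or r.min_gap_m < gap_margin
        or r.max_decel_mps2 > decel_limit
    ]

lf_results = [
    RunKPIs(0, min_ttc_s=4.1, min_gap_m=12.0, max_decel_mps2=2.1, collided=False),
    RunKPIs(1, min_ttc_s=1.4, min_gap_m=3.2,  max_decel_mps2=5.6, collided=False),
    RunKPIs(2, min_ttc_s=0.0, min_gap_m=0.0,  max_decel_mps2=7.9, collided=True),
]
print([r.run_id for r in promote_to_hf(lf_results)])  # -> [1, 2]
</code>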
Formal methods strengthen this flow. In the simulation-to-track pipeline, scenarios and safety properties are specified formally (e.g., via Scenic and Metric Temporal Logic), falsification synthesizes challenging test cases, and a mapping executes those cases on a closed track((Fremont, Daniel J., et al. "Formal scenario-based testing of autonomous vehicles: From simulation to the real world." 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC). IEEE, 2020.)). In published evidence, a majority of unsafe simulated cases reproduced as unsafe on track, and safe cases mostly remained safe, while time-series comparisons (e.g., DTW, Skorokhod metrics) quantified the sim-to-real differences relevant to planning and control. This is exactly the kind of transferability and measurement discipline a planning/control safety argument needs.
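
For the time-series comparison step, a plain dynamic time warping (DTW) distance between a simulated and a track-recorded speed profile can be computed as below; this is a generic textbook implementation, not the exact metric pipeline of the cited study:

<code python>
import math

def dtw_distance(sim, real):
    """Dynamic time warping distance between two 1-D signals (e.g., speed traces)."""
    n, m = len(sim), len(real)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(sim[i - 1] - real[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

# Illustrative speed profiles (m/s) from a simulated run and the corresponding track run.
sim_speed  = [0.0, 2.0, 4.0, 6.0, 6.0, 5.5, 4.0]
real_speed = [0.0, 1.5, 3.8, 5.9, 6.1, 5.0, 3.8, 3.5]
print(f"DTW distance: {dtw_distance(sim_speed, real_speed):.2f}")
</code>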
Finally, environment twins are built from aerial photogrammetry and point-cloud processing (with RTK-supported georeferencing), yielding maps and 3D assets that match the real campus, so trajectory-level decisions (overtake, yield, return-to-lane) are evaluated against faithful road geometries and occlusion patterns((Pikner, Heiko, et al. "Autonomous Driving Validation and Verification Using Digital Twins." VEHITS (2024): 204-211.)).
====== Methods and Metrics for Planning & Control ======
Mission-level planning validation starts from a start–goal pair and asks whether the vehicle reaches the destination via a safe, policy-compliant trajectory. The platform publishes three families of evidence: (i) trajectory-following error relative to the global path; (ii) safety outcomes such as collisions or violations of separation; and (iii) mission success (goal reached without violations). This couples path-selection quality to execution fidelity.
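
A simplified sketch of how the three evidence families could be combined into a per-run record (2-D poses, nearest-waypoint cross-track error, and the thresholds are assumptions for brevity):

<code python>
import math

def cross_track_error(pose, path):
    """Distance from the vehicle position to the nearest waypoint of the global path."""
    return min(math.dist(pose, wp) for wp in path)

def mission_verdict(poses, path, collisions, goal, goal_tol=2.0, xte_limit=1.0):
    """Combine trajectory-following error, safety outcome, and mission success."""
    max_xte = max(cross_track_error(p, path) for p in poses)
    reached = math.dist(poses[-1], goal) <= goal_tol
    success = reached and collisions == 0 and max_xte <= xte_limit
    return {"max_cross_track_error_m": max_xte,
            "collisions": collisions,
            "goal_reached": reached,
            "mission_success": success}

path  = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0), (15.0, 0.0)]
poses = [(0.0, 0.2), (5.1, 0.4), (9.9, 0.3), (14.8, 0.1)]
print(mission_verdict(poses, path, collisions=0, goal=(15.0, 0.0)))
</code>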
At the local planning level, the case study focuses on the planner inside the autonomous software. The planner synthesizes a global and a local path, then evaluates candidate trajectories against predictions of the surrounding actors to select a safe local trajectory for maneuvers such as passing and lane changes. Parameterizing scenarios with variables such as the initial separation to the lead vehicle and the lead vehicle's speed creates a grid of concrete cases that stress the evaluator's thresholds. The outcomes are categorized with meaningful labels (Success, Collision, Distance-to-Collision (DTC) violation, excessive deceleration, long pass without return, and timeout) so that planner tuning correlates directly with safety and comfort.
<figure Trajectory Validation>
{{ :en:safeav:ctrl:trajectory_validation.png?300 |Trajectory Validation}}
<caption>Trajectory validation example</caption>
</figure>
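
The outcome labels above can be assigned mechanically from each run's log; a minimal sketch with hypothetical field names and illustrative thresholds:

<code python>
def label_outcome(run, dtc_min_m=2.0, decel_limit_mps2=4.0, max_pass_time_s=20.0):
    """Map one overtaking run to the categories used in the result grid."""
    if run["collision"]:
        return "Collision"
    if run["min_dtc_m"] < dtc_min_m:
        return "DTC violation"
    if run["max_decel_mps2"] > decel_limit_mps2:
        return "Excessive deceleration"
    if not run["returned_to_lane"]:
        return "Long pass without return"
    if run["duration_s"] > max_pass_time_s:
        return "Timeout"
    return "Success"

# One cell of the (initial separation x lead speed) grid, illustrative values.
run = {"collision": False, "min_dtc_m": 3.4, "max_decel_mps2": 2.8,
       "returned_to_lane": True, "duration_s": 14.2}
print(label_outcome(run))  # -> Success
</code>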
Control validation links perception-induced delays to braking and steering outcomes. The framework computes Time-to-Collision (TTC, the current gap to the obstacle divided by the closing speed) along with the simulator and AV-stack response times to detected obstacles. Sufficient response time allows a safe return to nominal headway; excessive delay predicts collision, sharp braking, or planner oscillations. By logging ground truth, perception outputs, CAN bus commands, and the resulting dynamics, the analysis separates sensing delays from controller latency, revealing where mitigation belongs (planner margins vs. control gains).
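
As a small worked example (generic kinematics, not necessarily the framework's exact formulation): TTC is the gap divided by the closing speed, and the measured response is split into a perception share and a planning/control share:

<code python>
def time_to_collision(gap_m, ego_speed_mps, obstacle_speed_mps):
    """TTC = gap / closing speed; infinite when the ego is not closing in."""
    closing = ego_speed_mps - obstacle_speed_mps
    return gap_m / closing if closing > 0 else float("inf")

def response_time(ground_truth_t, detection_t, first_brake_cmd_t):
    """Split the overall delay into sensing and planning/control shares."""
    return {"perception_delay_s": detection_t - ground_truth_t,
            "control_delay_s": first_brake_cmd_t - detection_t,
            "total_response_s": first_brake_cmd_t - ground_truth_t}

ttc = time_to_collision(gap_m=18.0, ego_speed_mps=10.0, obstacle_speed_mps=4.0)  # 3.0 s
delays = response_time(ground_truth_t=12.40, detection_t=12.65, first_brake_cmd_t=12.95)
# If the total response time approaches the TTC, the run is flagged for mitigation.
print(f"TTC = {ttc:.1f} s, remaining margin = {ttc - delays['total_response_s']:.2f} s")
</code>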
A necessary dependency is localization health. The tests inject controlled GPS/IMU degradations and dropouts through simulator APIs, then compare expected vs. actual pose per frame to quantify drift. Because planning and control are sensitive to absolute and relative pose, this produces actionable thresholds for safe operation (e.g., the maximum tolerated RMS deviation before reducing speed or restricting maneuvers).
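
A minimal sketch of the per-frame drift check, with illustrative poses and an assumed RMS threshold:

<code python>
import math

def rms_pose_deviation(expected, actual):
    """Root-mean-square position error between expected and actual per-frame poses."""
    errs = [math.dist(e, a) ** 2 for e, a in zip(expected, actual)]
    return math.sqrt(sum(errs) / len(errs))

# Poses (x, y) per frame; the actual track drifts during a simulated GPS dropout.
expected = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
actual   = [(0.0, 0.1), (1.1, 0.2), (2.3, 0.4), (3.4, 0.7)]

RMS_LIMIT_M = 0.30  # illustrative threshold before speed is reduced
rms = rms_pose_deviation(expected, actual)
print(f"RMS deviation {rms:.2f} m -> {'restrict maneuvers' if rms > RMS_LIMIT_M else 'nominal'}")
</code>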