Decision making under uncertainty : theory and application
Main author:
Other authors or contributors:
Format: Book
Language: English
Publication: Cambridge : MIT Press, 2015
Edition: 1st ed.
Series: MIT Lincoln Laboratory Series
Subjects:
Online access: See the catalog
Notes: Includes index.
Physical description: xxv, 323 p. : ill.
ISBN: 9780262029254
Table of Contents:
- 1. Introduction
- 1.1. Decision Making
- 1.2. Example Applications
- 1.2.1. Traffic Alert and Collision Avoidance System
- 1.2.2. Unmanned Aircraft Persistent Surveillance
- 1.3. Methods for Designing Decision Agents
- 1.3.1. Explicit Programming
- 1.3.2. Supervised Learning
- 1.3.3. Optimization
- 1.3.4. Planning
- 1.3.5. Reinforcement Learning
- 1.4. Overview
- 1.5. Further Reading
- References
- 2. Probabilistic Models
- 2.1. Representation
- 2.1.1. Degrees of Belief and Probability
- 2.1.2. Probability Distributions
- 2.1.3. Joint Distributions
- 2.1.4. Bayesian Network Representation
- 2.1.5. Conditional Independence
- 2.1.6. Hybrid Bayesian Networks
- 2.1.7. Temporal Models
- 2.2. Inference
- 2.2.1. Inference for Classification
- 2.2.2. Inference in Temporal Models
- 2.2.3. Exact Inference
- 2.2.4. Complexity of Exact Inference
- 2.2.5. Approximate Inference
- 2.3. Parameter Learning
- 2.3.1. Maximum Likelihood Parameter Learning
- 2.3.2. Bayesian Parameter Learning
- 2.3.3. Nonparametric Learning
- 2.4. Structure Learning
- 2.4.1. Bayesian Structure Scoring
- 2.4.2. Directed Graph Search
- 2.4.3. Markov Equivalence Classes
- 2.4.4. Partially Directed Graph Search
- 2.5. Summary
- 2.6. Further Reading
- References
- 3. Decision Problems
- 3.1. Utility Theory
- 3.1.1. Constraints on Rational Preferences
- 3.1.2. Utility Functions
- 3.1.3. Maximum Expected Utility Principle
- 3.1.4. Utility Elicitation
- 3.1.5. Utility of Money
- 3.1.6. Multiple Variable Utility Functions
- 3.1.7. Irrationality
- 3.2. Decision Networks
- 3.2.1. Evaluating Decision Networks
- 3.2.2. Value of Information
- 3.2.3. Creating Decision Networks
- 3.3. Games
- 3.3.1. Dominant Strategy Equilibrium
- 3.3.2. Nash Equilibrium
- 3.3.3. Behavioral Game Theory
- 3.4. Summary
- 3.5. Further Reading
- References
- 4. Sequential Problems
- 4.1. Formulation
- 4.1.1. Markov Decision Processes
- 4.1.2. Utility and Reward
- 4.2. Dynamic Programming
- 4.2.1. Policies and Utilities
- 4.2.2. Policy Evaluation
- 4.2.3. Policy Iteration
- 4.2.4. Value Iteration
- 4.2.5. Grid World Example
- 4.2.6. Asynchronous Value Iteration
- 4.2.7. Closed- and Open-Loop Planning
- 4.3. Structured Representations
- 4.3.1. Factored Markov Decision Processes
- 4.3.2. Structured Dynamic Programming
- 4.4. Linear Representations
- 4.5. Approximate Dynamic Programming
- 4.5.1. Local Approximation
- 4.5.2. Global Approximation
- 4.6. Online Methods
- 4.6.1. Forward Search
- 4.6.2. Branch and Bound Search
- 4.6.3. Sparse Sampling
- 4.6.4. Monte Carlo Tree Search
- 4.7. Direct Policy Search
- 4.7.1. Objective Function
- 4.7.2. Local Search Methods
- 4.7.3. Cross Entropy Methods
- 4.7.4. Evolutionary Methods
- 4.8. Summary
- 4.9. Further Reading
- References
- 5. Model Uncertainty
- 5.1. Exploration and Exploitation
- 5.1.1. Multi-Armed Bandit Problems
- 5.1.2. Bayesian Model Estimation
- 5.1.3. Ad Hoc Exploration Strategies
- 5.1.4. Optimal Exploration Strategies
- 5.2. Maximum Likelihood Model-Based Methods
- 5.2.1. Randomized Updates
- 5.2.2. Prioritized Updates
- 5.3. Bayesian Model-Based Methods
- 5.3.1. Problem Structure
- 5.3.2. Beliefs over Model Parameters
- 5.3.3. Bayes-Adaptive Markov Decision Processes
- 5.3.4. Solution Methods
- 5.4. Model-Free Methods
- 5.4.1. Incremental Estimation
- 5.4.2. Q-Learning
- 5.4.3. Sarsa
- 5.4.4. Eligibility Traces
- 5.5. Generalization
- 5.5.1. Local Approximation
- 5.5.2. Global Approximation
- 5.5.3. Abstraction Methods
- 5.6. Summary
- 5.7. Further Reading
- References
- 6. State Uncertainty
- 6.1. Formulation
- 6.1.1. Example Problem
- 6.1.2. Partially Observable Markov Decision Processes
- 6.1.3. Policy Execution
- 6.1.4. Belief-State Markov Decision Processes
- 6.2. Belief Updating
- 6.2.1. Discrete State Filter
- 6.2.2. Linear-Gaussian Filter
- 6.2.3. Particle Filter
- 6.3. Exact Solution Methods
- 6.3.1. Alpha Vectors
- 6.3.2. Conditional Plans
- 6.3.3. Value Iteration
- 6.4. Offline Methods
- 6.4.1. Fully Observable Value Approximation
- 6.4.2. Fast Informed Bound
- 6.4.3. Point-Based Value Iteration
- 6.4.4. Randomized Point-Based Value Iteration
- 6.4.5. Point Selection
- 6.4.6. Linear Policies
- 6.5. Online Methods
- 6.5.1. Lookahead with Approximate Value Function
- 6.5.2. Forward Search
- 6.5.3. Branch and Bound
- 6.5.4. Monte Carlo Tree Search
- 6.6. Summary
- 6.7. Further Reading
- References
- 7. Cooperative Decision Making
- 7.1. Formulation
- 7.1.1. Decentralized POMDPs
- 7.1.2. Example Problem
- 7.1.3. Solution Representations
- 7.2. Properties
- 7.2.1. Differences with POMDPs
- 7.2.2. Dec-POMDP Complexity
- 7.2.3. Generalized Belief States
- 7.3. Notable Subclasses
- 7.3.1. Dec-MDPs
- 7.3.2. ND-POMDPs
- 7.3.3. MMDPs
- 7.4. Exact Solution Methods
- 7.4.1. Dynamic Programming
- 7.4.2. Heuristic Search
- 7.4.3. Policy Iteration
- 7.5. Approximate Solution Methods
- 7.5.1. Memory-Bounded Dynamic Programming
- 7.5.2. Joint Equilibrium Search
- 7.6. Communication
- 7.7. Summary
- 7.8. Further Reading
- References
- 8. Probabilistic Surveillance Video Search
- 8.1. Attribute-Based Person Search
- 8.1.1. Applications
- 8.1.2. Person Detection
- 8.1.3. Retrieval and Scoring
- 8.2. Probabilistic Appearance Model
- 8.2.1. Observed States
- 8.2.2. Basic Model Structure
- 8.2.3. Model Extensions
- 8.3. Learning and Inference Techniques
- 8.3.1. Parameter Learning
- 8.3.2. Hidden State Inference
- 8.3.3. Scoring Algorithm
- 8.4. Performance
- 8.4.1. Search Accuracy
- 8.4.2. Search Timing
- 8.5. Interactive Search Tool
- 8.6. Summary
- References
- 9. Dynamic Models for Speech Applications
- 9.1. Modeling Speech Signals
- 9.1.1. Feature Extraction
- 9.1.2. Hidden Markov Models
- 9.1.3. Gaussian Mixture Models
- 9.1.4. Expectation-Maximization Algorithm
- 9.2. Speech Recognition
- 9.3. Topic Identification
- 9.4. Language Recognition
- 9.5. Speaker Identification
- 9.5.1. Forensic Speaker Recognition
- 9.6. Machine Translation
- 9.7. Summary
- References
- 10. Optimized Airborne Collision Avoidance
- 10.1. Airborne Collision Avoidance Systems
- 10.1.1. Traffic Alert and Collision Avoidance System
- 10.1.2. Limitations of Existing System
- 10.1.3. Unmanned Aircraft Sense and Avoid
- 10.1.4. Airborne Collision Avoidance System X
- 10.2. Collision Avoidance Problem Formulation
- 10.2.1. Resolution Advisories
- 10.2.2. Dynamic Model
- 10.2.3. Reward Function
- 10.2.4. Dynamic Programming
- 10.3. State Estimation
- 10.3.1. Sensor Error
- 10.3.2. Pilot Response
- 10.3.3. Time to Potential Collision
- 10.4. Real-Time Execution
- 10.4.1. Online Costs
- 10.4.2. Multiple Threats
- 10.4.3. Traffic Alerts
- 10.5. Evaluation
- 10.5.1. Safety Analysis
- 10.5.2. Operational Suitability and Acceptability
- 10.5.3. Parameter Tuning
- 10.5.4. Flight Test
- 10.6. Summary
- References
- 11. Multiagent Planning for Persistent Surveillance
- 11.1. Mission Description
- 11.2. Centralized Problem Formulation
- 11.2.1. State Space
- 11.2.2. Action Space
- 11.2.3. State Transition Model
- 11.2.4. Reward Function
- 11.3. Decentralized Approximate Formulations
- 11.3.1. Factored Decomposition
- 11.3.2. Group Aggregate Decomposition
- 11.3.3. Planning
- 11.4. Model Learning
- 11.5. Flight Test
- 11.6. Summary
- References
- 12. Integrating Automation with Humans
- 12.1. Human Capabilities and Coping
- 12.1.1. Perceptual and Cognitive Capabilities
- 12.1.2. Naturalistic Decision Making
- 12.2. Considering the Human in Design
- 12.2.1. Trust and Value of Decision Logic Transparency
- 12.2.2. Designing for Different Levels of Certainty
- 12.2.3. Supporting Decisions over Long Timescales
- 12.3. A Systems View of Implementation
- 12.3.1. Interface, Training, and Procedures
- 12.3.2. Measuring Decision Support Effectiveness
- 12.3.3. Organizational Influences on System Effectiveness
- 12.4. Summary
- References
- Index