Decision making under uncertainty : theory and application

Bibliographic Details
Main Author: Kochenderfer, Mykel J.
Other Authors or Contributors: Amato, Christopher, Vian, John, How, Jonathan P., Chowdhary, Girish, Üre, N. Kemal, Torres-Carrasquillo, Pedro A., Thornton, Jason R., Davison Reynolds, Hayley J.
Format: Book
Language: English
Publication Data: Cambridge : MIT Press, 2015
Edition: 1st ed.
Series: MIT Lincoln Laboratory Series
Notes: Includes index.
Physical Description: xxv, 323 p. : ill.
ISBN: 9780262029254
Table of Contents:
  • 1. Introduction
  • 1.1. Decision Making
  • 1.2. Example Applications
  • 1.2.1. Traffic Alert and Collision Avoidance System
  • 1.2.2. Unmanned Aircraft Persistent Surveillance
  • 1.3. Methods for Designing Decision Agents
  • 1.3.1. Explicit Programming
  • 1.3.2. Supervised Learning
  • 1.3.3. Optimization
  • 1.3.4. Planning
  • 1.3.5. Reinforcement Learning
  • 1.4. Overview
  • 1.5. Further Reading
  • References
  • 2. Probabilistic Models
  • 2.1. Representation
  • 2.1.1. Degrees of Belief and Probability
  • 2.1.2. Probability Distributions
  • 2.1.3. Joint Distributions
  • 2.1.4. Bayesian Network Representation
  • 2.1.5. Conditional Independence
  • 2.1.6. Hybrid Bayesian Networks
  • 2.1.7. Temporal Models
  • 2.2. Inference
  • 2.2.1. Inference for Classification
  • 2.2.2. Inference in Temporal Models
  • 2.2.3. Exact Inference
  • 2.2.4. Complexity of Exact Inference
  • 2.2.5. Approximate Inference
  • 2.3. Parameter Learning
  • 2.3.1. Maximum Likelihood Parameter Learning
  • 2.3.2. Bayesian Parameter Learning
  • 2.3.3. Nonparametric Learning
  • 2.4. Structure Learning
  • 2.4.1. Bayesian Structure Scoring
  • 2.4.2. Directed Graph Search
  • 2.4.3. Markov Equivalence Classes
  • 2.4.4. Partially Directed Graph Search
  • 2.5. Summary
  • 2.6. Further Reading
  • References
  • 3. Decision Problems
  • 3.1. Utility Theory
  • 3.1.1. Constraints on Rational Preferences
  • 3.1.2. Utility Functions
  • 3.1.3. Maximum Expected Utility Principle
  • 3.1.4. Utility Elicitation
  • 3.1.5. Utility of Money
  • 3.1.6. Multiple Variable Utility Functions
  • 3.1.7. Irrationality
  • 3.2. Decision Networks
  • 3.2.1. Evaluating Decision Networks
  • 3.2.2. Value of Information
  • 3.2.3. Creating Decision Networks
  • 3.3. Games
  • 3.3.1. Dominant Strategy Equilibrium
  • 3.3.2. Nash Equilibrium
  • 3.3.3. Behavioral Game Theory
  • 3.4. Summary
  • 3.5. Further Reading
  • References
  • 4. Sequential Problems
  • 4.1. Formulation
  • 4.1.1. Markov Decision Processes
  • 4.1.2. Utility and Reward
  • 4.2. Dynamic Programming
  • 4.2.1. Policies and Utilities
  • 4.2.2. Policy Evaluation
  • 4.2.3. Policy Iteration
  • 4.2.4. Value Iteration
  • 4.2.5. Grid World Example
  • 4.2.6. Asynchronous Value Iteration
  • 4.2.7. Closed- and Open-Loop Planning
  • 4.3. Structured Representations
  • 4.3.1. Factored Markov Decision Processes
  • 4.3.2. Structured Dynamic Programming
  • 4.4. Linear Representations
  • 4.5. Approximate Dynamic Programming
  • 4.5.1. Local Approximation
  • 4.5.2. Global Approximation
  • 4.6. Online Methods
  • 4.6.1. Forward Search
  • 4.6.2. Branch and Bound Search
  • 4.6.3. Sparse Sampling
  • 4.6.4. Monte Carlo Tree Search
  • 4.7. Direct Policy Search
  • 4.7.1. Objective Function
  • 4.7.2. Local Search Methods
  • 4.7.3. Cross Entropy Methods
  • 4.7.4. Evolutionary Methods
  • 4.8. Summary
  • 4.9. Further Reading
  • References
  • 5. Model Uncertainty
  • 5.1. Exploration and Exploitation
  • 5.1.1. Multi-Armed Bandit Problems
  • 5.1.2. Bayesian Model Estimation
  • 5.1.3. Ad Hoc Exploration Strategies
  • 5.1.4. Optimal Exploration Strategies
  • 5.2. Maximum Likelihood Model-Based Methods
  • 5.2.1. Randomized Updates
  • 5.2.2. Prioritized Updates
  • 5.3. Bayesian Model-Based Methods
  • 5.3.1. Problem Structure
  • 5.3.2. Beliefs over Model Parameters
  • 5.3.3. Bayes-Adaptive Markov Decision Processes
  • 5.3.4. Solution Methods
  • 5.4. Model-Free Methods
  • 5.4.1. Incremental Estimation
  • 5.4.2. Q-Learning
  • 5.4.3. Sarsa
  • 5.4.4. Eligibility Traces
  • 5.5. Generalization
  • 5.5.1. Local Approximation
  • 5.5.2. Global Approximation
  • 5.5.3. Abstraction Methods
  • 5.6. Summary
  • 5.7. Further Reading
  • References
  • 6. State Uncertainty
  • 6.1. Formulation
  • 6.1.1. Example Problem
  • 6.1.2. Partially Observable Markov Decision Processes
  • 6.1.3. Policy Execution
  • 6.1.4. Belief-State Markov Decision Processes
  • 6.2. Belief Updating
  • 6.2.1. Discrete State Filter
  • 6.2.2. Linear-Gaussian Filter
  • 6.2.3. Particle Filter
  • 6.3. Exact Solution Methods
  • 6.3.1. Alpha Vectors
  • 6.3.2. Conditional Plans
  • 6.3.3. Value Iteration
  • 6.4. Offline Methods
  • 6.4.1. Fully Observable Value Approximation
  • 6.4.2. Fast Informed Bound
  • 6.4.3. Point-Based Value Iteration
  • 6.4.4. Randomized Point-Based Value Iteration
  • 6.4.5. Point Selection
  • 6.4.6. Linear Policies
  • 6.5. Online Methods
  • 6.5.1. Lookahead with Approximate Value Function
  • 6.5.2. Forward Search
  • 6.5.3. Branch and Bound
  • 6.5.4. Monte Carlo Tree Search
  • 6.6. Summary
  • 6.7. Further Reading
  • References
  • 7. Cooperative Decision Making
  • 7.1. Formulation
  • 7.1.1. Decentralized POMDPs
  • 7.1.2. Example Problem
  • 7.1.3. Solution Representations
  • 7.2. Properties
  • 7.2.1. Differences with POMDPs
  • 7.2.2. Dec-POMDP Complexity
  • 7.2.3. Generalized Belief States
  • 7.3. Notable Subclasses
  • 7.3.1. Dec-MDPs
  • 7.3.2. ND-POMDPs
  • 7.3.3. MMDPs
  • 7.4. Exact Solution Methods
  • 7.4.1. Dynamic Programming
  • 7.4.2. Heuristic Search
  • 7.4.3. Policy Iteration
  • 7.5. Approximate Solution Methods
  • 7.5.1. Memory-Bounded Dynamic Programming
  • 7.5.2. Joint Equilibrium Search
  • 7.6. Communication
  • 7.7. Summary
  • 7.8. Further Reading
  • References
  • 8. Probabilistic Surveillance Video Search
  • 8.1. Attribute-Based Person Search
  • 8.1.1. Applications
  • 8.1.2. Person Detection
  • 8.1.3. Retrieval and Scoring
  • 8.2. Probabilistic Appearance Model
  • 8.2.1. Observed States
  • 8.2.2. Basic Model Structure
  • 8.2.3. Model Extensions
  • 8.3. Learning and Inference Techniques
  • 8.3.1. Parameter Learning
  • 8.3.2. Hidden State Inference
  • 8.3.3. Scoring Algorithm
  • 8.4. Performance
  • 8.4.1. Search Accuracy
  • 8.4.2. Search Timing
  • 8.5. Interactive Search Tool
  • 8.6. Summary
  • References
  • 9. Dynamic Models for Speech Applications
  • 9.1. Modeling Speech Signals
  • 9.1.1. Feature Extraction
  • 9.1.2. Hidden Markov Models
  • 9.1.3. Gaussian Mixture Models
  • 9.1.4. Expectation-Maximization Algorithm
  • 9.2. Speech Recognition
  • 9.3. Topic Identification
  • 9.4. Language Recognition
  • 9.5. Speaker Identification
  • 9.5.1. Forensic Speaker Recognition
  • 9.6. Machine Translation
  • 9.7. Summary
  • References
  • 10. Optimized Airborne Collision Avoidance
  • 10.1. Airborne Collision Avoidance Systems
  • 10.1.1. Traffic Alert and Collision Avoidance System
  • 10.1.2. Limitations of Existing System
  • 10.1.3. Unmanned Aircraft Sense and Avoid
  • 10.1.4. Airborne Collision Avoidance System X
  • 10.2. Collision Avoidance Problem Formulation
  • 10.2.1. Resolution Advisories
  • 10.2.2. Dynamic Model
  • 10.2.3. Reward Function
  • 10.2.4. Dynamic Programming
  • 10.3. State Estimation
  • 10.3.1. Sensor Error
  • 10.3.2. Pilot Response
  • 10.3.3. Time to Potential Collision
  • 10.4. Real-Time Execution
  • 10.4.1. Online Costs
  • 10.4.2. Multiple Threats
  • 10.4.3. Traffic Alerts
  • 10.5. Evaluation
  • 10.5.1. Safety Analysis
  • 10.5.2. Operational Suitability and Acceptability
  • 10.5.3. Parameter Tuning
  • 10.5.4. Flight Test
  • 10.6. Summary
  • References
  • 11. Multiagent Planning for Persistent Surveillance
  • 11.1. Mission Description
  • 11.2. Centralized Problem Formulation
  • 11.2.1. State Space
  • 11.2.2. Action Space
  • 11.2.3. State Transition Model
  • 11.2.4. Reward Function
  • 11.3. Decentralized Approximate Formulations
  • 11.3.1. Factored Decomposition
  • 11.3.2. Group Aggregate Decomposition
  • 11.3.3. Planning
  • 11.4. Model Learning
  • 11.5. Flight Test
  • 11.6. Summary
  • References
  • 12. Integrating Automation with Humans
  • 12.1. Human Capabilities and Coping
  • 12.1.1. Perceptual and Cognitive Capabilities
  • 12.1.2. Naturalistic Decision Making
  • 12.2. Considering the Human in Design
  • 12.2.1. Trust and Value of Decision Logic Transparency
  • 12.2.2. Designing for Different Levels of Certainty
  • 12.2.3. Supporting Decisions over Long Timescales
  • 12.3. A Systems View of Implementation
  • 12.3.1. Interface, Training, and Procedures
  • 12.3.2. Measuring Decision Support Effectiveness
  • 12.3.3. Organizational Influences on System Effectiveness
  • 12.4. Summary
  • References
  • Index