Non-parametric filters - particle filter and histogram filter summary

2021-10-04

Particle filter intro

State estimation filtering can be performed with the KF, the PF, and membership filters.


The PF can be applied to both linear and non-linear systems. For a linear system, the usual requirement is a linear state-transition and observation model with (typically Gaussian) noise, as sketched below.

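As a hedged sketch, the linear-Gaussian model usually assumed in this case can be written as follows (the symbols $A$, $B$, $H$, $Q$, $R$ are my own notation, not from the original post):

$$
\begin{aligned}
x_k &= A x_{k-1} + B u_k + w_k, \qquad w_k \sim \mathcal{N}(0, Q) \\
z_k &= H x_k + v_k, \qquad v_k \sim \mathcal{N}(0, R)
\end{aligned}
$$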

Why PF? Most systems are non-linear, and the Gaussian-noise assumption is sometimes violated. The PF uses Monte Carlo methods to simulate samples; by the law of large numbers, as the number of particles grows, the empirical distribution gets close to the true distribution.

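As a rough illustration of this law-of-large-numbers argument (a sketch of mine, not from the original post), the empirical mean of samples from a non-Gaussian mixture converges to the true mean as the sample count grows:

```python
import numpy as np

# True (non-Gaussian) distribution: a two-component Gaussian mixture.
def sample_true(n, rng):
    comp = rng.random(n) < 0.3
    return np.where(comp, rng.normal(-2.0, 0.5, n), rng.normal(1.0, 1.0, n))

rng = np.random.default_rng(0)
true_mean = 0.3 * (-2.0) + 0.7 * 1.0  # analytic mean of the mixture

for n in [10, 100, 1000, 100000]:
    particles = sample_true(n, rng)
    # The error of the empirical mean shrinks as the number of particles grows.
    print(n, abs(particles.mean() - true_mean))
```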

Steps

x: state variable; u: inputs; z: observations; d: data.

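In this notation, the quantity the PF approximates is the belief over the state given all inputs and observations; a sketch using the common weighted-particle notation (my assumption, not the original formula):

$$
\mathrm{bel}(x_t) = p(x_t \mid z_{1:t}, u_{1:t}) \approx \left\{ \left\langle x_t^{[i]}, w_t^{[i]} \right\rangle \right\}_{i=1}^{N}
$$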

  • Prediction: sample each particle from the motion model (see the equations after this list).


  • Update: weight each particle by the observation likelihood.


  • Resample: sampling importance resampling. Without it the particle set degenerates: low-weight particles are wasted, the high-probability (MAP) region is not represented well, and the particle density no longer represents the real pdf. Importance resampling re-draws particles in proportion to their weights, i.e. posterior over prior, as given by the observation model.


  • Output the estimated state (e.g. the weighted mean of the particles).

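A hedged sketch of the standard equations behind the four steps above, in my own notation ($i$ indexes particles, $N$ is the particle count):

$$
\begin{aligned}
\text{Prediction:} \quad & x_t^{[i]} \sim p(x_t \mid u_t, x_{t-1}^{[i]}) \\
\text{Update:} \quad & w_t^{[i]} \propto w_{t-1}^{[i]} \, p(z_t \mid x_t^{[i]}) \\
\text{Resample:} \quad & \text{draw } N \text{ particles with probability} \propto w_t^{[i]}, \text{ then reset } w_t^{[i]} = 1/N \\
\text{Estimate:} \quad & \hat{x}_t = \sum_{i=1}^{N} w_t^{[i]} \, x_t^{[i]}
\end{aligned}
$$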

This process can be written as a single predict, update, resample loop per time step.

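A minimal Python sketch of this loop for a 1-D state (the motion and observation models here are made up for illustration and are not the system from the original post):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000  # number of particles

def predict(particles, u):
    # Propagate each particle through a hypothetical motion model with noise.
    return particles + u + rng.normal(0.0, 0.1, size=particles.shape)

def update(particles, z):
    # Weight each particle by a hypothetical Gaussian observation likelihood.
    w = np.exp(-0.5 * ((z - particles) / 0.5) ** 2)
    return w / np.sum(w)

def resample(particles, weights):
    # Importance resampling: draw particles in proportion to their weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx]

# One filter cycle: predict -> update -> resample -> estimate.
particles = rng.normal(0.0, 1.0, N)  # initial belief
u, z = 0.5, 0.7                      # example input and observation
particles = predict(particles, u)
weights = update(particles, z)
particles = resample(particles, weights)
x_hat = particles.mean()             # estimated state after resampling
print(x_hat)
```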

Improvement: Rao-Blackwellised Particle Filter

The aim of Rao-Blackwellised Particle Filtering is to find an estimator of the conditional distribution $p(y_t \mid z_t)$ such that fewer particles are required to reach the same accuracy as a typical particle filter.

Split the posterior probability into two parts: one can be computed in closed form (the marginal probability, accumulated analytically), and the other is estimated by the PF.

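A sketch of the factorization this describes, writing $x$ for the part sampled by the PF and $y$ for the part with a closed-form solution (the exact split and symbols are my assumption):

$$
p(x_{1:t}, y_{1:t} \mid z_{1:t}) = \underbrace{p(y_{1:t} \mid x_{1:t}, z_{1:t})}_{\text{closed form, e.g. a KF}} \; \underbrace{p(x_{1:t} \mid z_{1:t})}_{\text{particle filter}}
$$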

Application

Summary

  1. Why PF? Advantages & disadvantages.
  2. How does the PF work?
  3. How to improve the PF?

Histogram filter

Another non-parametric method, which uses a grid to represent the state. The formulation is very similar to the PF.

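A hedged sketch of the discrete Bayes (histogram) filter recursion over grid cells $x_k$, in my own notation, with prediction and update steps mirroring the PF:

$$
\begin{aligned}
\bar{p}_{k,t} &= \sum_{i} p(X_t = x_k \mid u_t, X_{t-1} = x_i) \, p_{i,t-1} \\
p_{k,t} &= \eta \, p(z_t \mid X_t = x_k) \, \bar{p}_{k,t}
\end{aligned}
$$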

More state estimation with parametric filters is covered in:

KF & EKF and summary

KL divergence: Consider two probability distributions P and Q. Usually, P represents the data, the observations, or a probability distribution precisely measured, while Q represents a theory, a model, a description, or an approximation of P. The Kullback–Leibler divergence is then interpreted as the average difference in the number of bits required for encoding samples of P using a code optimized for Q rather than one optimized for P. In other words, it measures how well the model Q matches the observed distribution P, and it is closely related to the cross entropy (see the relationship below and the Wikipedia article).

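The standard discrete form of the definition:

$$
D_{\mathrm{KL}}(P \parallel Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}
$$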

A sample KLD computation:

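A small Python sketch of such a computation for two hypothetical discrete distributions (the numbers are made up for illustration):

```python
import numpy as np

# Hypothetical observed distribution P and model distribution Q over three outcomes.
P = np.array([0.5, 0.3, 0.2])
Q = np.array([0.4, 0.4, 0.2])

# KL divergence in bits: the average extra code length paid for encoding
# samples of P with a code optimized for Q.
kld = np.sum(P * np.log2(P / Q))
print(kld)  # a small positive number; it is 0 only when P == Q
```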

Its relationship with cross-entropy:

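The standard identity, with $H(P)$ the entropy of $P$ and $H(P, Q)$ the cross-entropy:

$$
D_{\mathrm{KL}}(P \parallel Q) = H(P, Q) - H(P), \qquad H(P, Q) = -\sum_{x} P(x) \log Q(x)
$$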

Ref:

  1. State Estimation Summary

  2. Linear System

  3. MIT particle filter PF and application

  4. Monte Carlo Localization

  5. Real-time PF

  6. Rao-Blackwellised PF

  7. Histogram filter

  8. HF2