Test the Linear Gaussian Model: A comparison with Kalman-filtering algorithm
Testing pypomp 0.0.6 on 2025-05-15 at run level 2 (0 is for debugging, 2 is full-length).
Model Setup
We consider a linear Gaussian dynamic system, \[\begin{align}
X_n &= A X_{n-1} + W_n
\end{align}\] where \(X_n\) is the state vector at time \(n\), \(A\) is the state transition matrix, and \(W_n \sim \mathcal{N}(0, Q)\) is the process noise, normally distributed with mean \(0\) and covariance \(Q\). The measurement model is \[\begin{align}
Y_n &= C X_n + V_n,
\end{align}\] where \(Y_n\) is the observation vector at time \(n\), \(C\) is the observation matrix, and \(V_n \sim \mathcal{N}(0, R)\) is the observation noise, normally distributed with mean \(0\) and covariance \(R\).
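For reference, the Kalman filter gives the exact log-likelihood of this linear Gaussian model through the prediction decomposition, \[\begin{align}
\ell(\theta) &= \sum_{n=1}^{N} \log \phi\!\left(Y_n;\, C \hat{X}_{n|n-1},\, C P_{n|n-1} C^\top + R\right),
\end{align}\] where \(\hat{X}_{n|n-1}\) and \(P_{n|n-1}\) are the one-step predictive mean and covariance from the Kalman recursion and \(\phi(\,\cdot\,;\mu,\Sigma)\) denotes the multivariate normal density with mean \(\mu\) and covariance \(\Sigma\). This exact value is the benchmark against which the particle filter and MOP estimates are compared below.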
Here, we build the linear Gaussian POMP model explicitly rather than using pypomp.LG. This is partly because the full functionality is not yet in place: for example, we cannot simulate data from a POMP object until the class has an rmeasure, which is not needed for filtering and so has not yet been fully incorporated.
Set N (the length of the time series) to 100 and generate the linear Gaussian states and observations:
Code
# po = pypomp.LG(T=10)

def get_thetas(theta):
    A = theta[0:4].reshape(2, 2)
    C = theta[4:8].reshape(2, 2)
    Q = theta[8:12].reshape(2, 2)
    R = theta[12:16].reshape(2, 2)
    return A, C, Q, R

def transform_thetas(A, C, Q, R):
    return jnp.concatenate([A.flatten(), C.flatten(), Q.flatten(), R.flatten()])

fixed = False
key = jax.random.PRNGKey(111)
angle = 0.2
angle2 = angle if fixed else -0.5
A = jnp.array([[jnp.cos(angle2), -jnp.sin(angle)],
               [jnp.sin(angle), jnp.cos(angle2)]])
C = jnp.eye(2)
Q = jnp.array([[1, 1e-4], [1e-4, 1]])  # 100
R = jnp.array([[1, .1], [.1, 1]])  # / 10
theta = transform_thetas(A, C, Q, R)

def generate_data(N, key):
    xs = []
    ys = []
    x = jnp.zeros(2)
    for i in range(N):
        key, subkey = jax.random.split(key)
        x = jax.random.multivariate_normal(key=subkey, mean=A @ x, cov=Q)
        key, subkey = jax.random.split(key)
        y = jax.random.multivariate_normal(key=subkey, mean=C @ x, cov=R)
        xs.append(x)
        ys.append(y)
    xs = jnp.array(xs)
    ys = jnp.array(ys)
    return xs, ys, key

def custom_rinit(theta, J, covars=None):
    return jnp.zeros((J, 2))

def custom_rproc(state, theta, key, covars=None):
    A, C, Q, R = get_thetas(theta)
    key, subkey = jax.random.split(key)
    return jax.random.multivariate_normal(key=subkey, mean=A @ state, cov=Q)

def custom_dmeas(y, preds, theta):
    A, C, Q, R = get_thetas(theta)
    return jax.scipy.stats.multivariate_normal.logpdf(y, preds, R)

rinit = custom_rinit
rproc = custom_rproc
dmeas = custom_dmeas

rprocess = jax.vmap(custom_rproc, (0, None, 0, None))
dmeasure = jax.vmap(custom_dmeas, (None, 0, None))
rprocesses = jax.vmap(custom_rproc, (0, 0, 0, None))
dmeasures = jax.vmap(custom_dmeas, (None, 0, 0))

N = 100  # length of the time series, as stated above
xs, ys, key = generate_data(N, key)  # simulate the linear Gaussian states and observations
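The exact benchmark log-likelihood used below comes from a Kalman filter object kf whose construction is not shown in this section. A minimal sketch of one way to build it, assuming the pykalman package (consistent with the kf.loglikelihood(ys) call further down) and an initial state prior of \(\mathcal{N}(0, Q)\) chosen to match generate_data:

import numpy as np
from pykalman import KalmanFilter  # assumption: pykalman supplies the exact reference likelihood

kf = KalmanFilter(
    transition_matrices=np.asarray(A),       # state transition matrix A
    observation_matrices=np.asarray(C),      # observation matrix C
    transition_covariance=np.asarray(Q),     # process noise covariance Q
    observation_covariance=np.asarray(R),    # observation noise covariance R
    initial_state_mean=np.zeros(2),          # the simulator starts the latent state at the origin
    initial_state_covariance=np.asarray(Q),  # assumption: prior covariance for the first latent state
)
print(kf.loglikelihood(np.asarray(ys)))      # exact log-likelihood of the simulated observations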
Set J=10000 particles and compare the exact Kalman filter log-likelihood with the log-mean-exponential of 100 replicated estimates from each of several methods, including the classical particle filter and MOP.
Logmeanexp of Particle Filtering = -378.45117
SE on logmeanexp = 0.03881513
difference between Kalman-Filtering and logmeanexp of Particle Filtering = 0.048736572
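The code producing the three numbers above is not shown in this section. A minimal sketch of how the log-mean-exp and its standard error could be computed from the 100 replicated particle filter log-likelihoods (the jackknife standard error is an assumption on our part, not necessarily the helper used by this report):

import jax.numpy as jnp
from jax.scipy.special import logsumexp

def logmeanexp_sketch(x):
    # log of the average of exp(x), evaluated stably on the log scale
    x = jnp.asarray(x)
    return logsumexp(x) - jnp.log(x.shape[0])

def logmeanexp_se_sketch(x):
    # jackknife standard error for the log-mean-exp estimator
    x = jnp.asarray(x)
    n = x.shape[0]
    loo = jnp.array([logmeanexp_sketch(jnp.delete(x, i)) for i in range(n)])  # leave-one-out estimates
    return jnp.sqrt((n - 1) * jnp.var(loo))

# Usage, given a length-100 vector `ll` of replicated log-likelihood estimates:
#   logmeanexp_sketch(ll), logmeanexp_se_sketch(ll)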
We check that the difference between the Kalman filter log-likelihood and the particle filter log-mean-exp (about 0.049) is comparable to the Monte Carlo standard error (about 0.039, i.e., roughly 1.3 standard errors), and that the Monte Carlo error itself is small because we use many particles and many replications.
Next, we test the output of the MOP algorithm. The pypomp MOP algorithm is run with \(\phi = \theta\). In this case, the MOP algorithm should be equivalent to the particle filter and should produce the same output when given the same random seed.
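Before the full sweep over \(\alpha\) that follows, this equivalence can be spot-checked with a single shared key, reusing the same internal helpers called below (a sketch; the choice \(\alpha = 1\) is one case from the sweep, and agreement is expected only up to floating-point differences):

J = 10000  # particle count, as stated in the text (assumed already set in a hidden cell)
check_key = jax.random.PRNGKey(1)  # any fixed key shared by both filters
ll_mop = -_mop_internal(theta, ys, J=J, rinit=rinit, rprocess=rprocess,
                        dmeasure=dmeasure, covars=None, key=check_key, alpha=1.0)
ll_pf = -_pfilter_internal(theta, ys, J=J, rinit=rinit, rprocess=rprocess,
                           dmeasure=dmeasure, covars=None, key=check_key, thresh=-1)
print(float(ll_mop), float(ll_pf))  # with the same key and phi = theta, these should agree closely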
alphas = [0, 0.1, 0.3, 0.6, 0.9, 1]
key = jax.random.PRNGKey(0)  # Use a fixed seed for reproducibility
subkeys = jax.random.split(key, 100)  # Pre-generate 100 keys

pickle_file = out_dir + "/test-alpha.pkl"
if os.path.exists(pickle_file):
    file = open(pickle_file, 'rb')
    results = pickle.load(file)
    loglike_pf = pickle.load(file)
else:
    results = []
    for alpha in alphas:
        loglike_mop = []
        for i, subkey in enumerate(subkeys):
            mop_val = -_mop_internal(theta, ys, J=J, rinit=rinit, rprocess=rprocess,
                                     dmeasure=dmeasure, covars=None, key=subkey, alpha=alpha)
            loglike_mop.append(mop_val)
        loglike_mop = jnp.array(loglike_mop)
        logmeanexp_val = logmeanexp(loglike_mop)
        difference = kf.loglikelihood(ys) - logmeanexp_val
        results.append((alpha, logmeanexp_val, difference))
    # Use the same random key to test the particle filter output
    loglike_pf = []
    for i, subkey in enumerate(subkeys):
        pfilter_val = -_pfilter_internal(theta, ys, J=J, rinit=rinit, rprocess=rprocess,
                                         dmeasure=dmeasure, covars=None, key=subkey, thresh=-1)
        loglike_pf.append(pfilter_val)
    file = open(pickle_file, 'wb')
    pickle.dump(results, file)
    pickle.dump(loglike_pf, file)

alpha_table = tabulate(results, tablefmt="grid", headers=("alpha", "pf", "kalman-pf"))
print(alpha_table)
+---------+----------+-------------+
|   alpha |       pf |   kalman-pf |
+=========+==========+=============+
|     0   | -378.452 |   0.0491028 |
+---------+----------+-------------+
|     0.1 | -378.452 |   0.0491028 |
+---------+----------+-------------+
|     0.3 | -378.452 |   0.0491028 |
+---------+----------+-------------+
|     0.6 | -378.452 |   0.0491028 |
+---------+----------+-------------+
|     0.9 | -378.452 |   0.0491333 |
+---------+----------+-------------+
|     1   | -378.452 |   0.0491028 |
+---------+----------+-------------+
Consistency: for \(\alpha = 0\) and \(\alpha = 1\), the MOP logmeanexp is \(-378.452\), with a difference from the Kalman filter log-likelihood of \(0.0491028\), matching the particle filter logmeanexp. For the intermediate \(\alpha\) values the MOP logmeanexp is essentially identical, with only \(\alpha = 0.9\) deviating slightly (a Kalman filter difference of \(0.0491333\)). Thus the logmeanexp values at \(\alpha = 0\) and \(\alpha = 1\) agree closely with both the Kalman filter result and the particle filter logmeanexp.
Test the Linear Gaussian Model: How the estimated log-likelihood difference and running time vary with N and J