Abstract
To characterize the Kullback-Leibler divergence and Fisher information in general parametrized hidden Markov models, we first show that the log likelihood and its derivatives can be represented as additive functionals of a Markovian iterated function system, and then provide explicit characterizations of these two quantities through this representation. Moreover, we show that the Kullback-Leibler divergence can be locally approximated by a quadratic function determined by the Fisher information. Results relating to the Cramér-Rao lower bound and the Hájek-Le Cam local asymptotic minimax theorem are also given. As an application of our results, we provide a theoretical justification for using the Akaike information criterion (AIC) for model selection in general hidden Markov models. Finally, we study three concrete models to illustrate our theory: a Gaussian vector autoregressive moving average model of order (p, q), recurrent neural networks, and the temporal restricted Boltzmann machine.
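For orientation, the local quadratic approximation mentioned above has the following generic shape. This is only a sketch in assumed notation, not the paper's own statement: \theta_0 stands for the reference parameter, K(\theta_0, \theta) for the Kullback-Leibler divergence between the models at \theta_0 and \theta, and I(\theta_0) for the Fisher information matrix. The precise regularity conditions and definitions are those given in the body of the paper.

```latex
% Assumed notation (not taken from the abstract): \theta_0, K, I.
% Second-order expansion of the Kullback-Leibler divergence around \theta_0.
\[
  K(\theta_0, \theta)
  = \tfrac{1}{2}\,(\theta - \theta_0)^{\top} I(\theta_0)\,(\theta - \theta_0)
  + o\!\left(\lVert \theta - \theta_0 \rVert^{2}\right),
  \qquad \theta \to \theta_0 .
\]
```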