For any time series analysis one needs a signal; for our purpose we will use, as the signal, the numbers that can be written as a sum of two squares. So the signal looks like
x={0,1,2,4,5,8,9,10,13,16,17,18,20,25,26,29,32,34,...}
But this sequence is increasing, so instead of the sequence itself we will look at the differences of consecutive elements, y(t)=x(t)-x(t-1). The data above then becomes
y={1,1,2,1,3,1,1,3,3,1,1,2,5,1,3,3,2,...}
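As a sanity check, here is a minimal sketch in Python (assuming nothing beyond the standard library) of how x and y can be generated; the function name is mine, for illustration:

```python
import math

def is_sum_of_two_squares(n):
    # Brute force over the smaller square; fine for small n.
    for a in range(math.isqrt(n) + 1):
        b2 = n - a * a
        s = math.isqrt(b2)
        if s * s == b2:
            return True
    return False

x = [n for n in range(40) if is_sum_of_two_squares(n)]
y = [x[t] - x[t - 1] for t in range(1, len(x))]
print(x)  # [0, 1, 2, 4, 5, 8, 9, 10, 13, 16, ...]
print(y)  # [1, 1, 2, 1, 3, 1, 1, 3, 3, ...]
```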
Let us start with some basic analysis of the signal. A simple check shows that the maximum grows as log(t), so even though much of our analysis goes up to 16 million time steps, the maximum value reached is only about 63. First, let us look at the histogram data, i.e. h[n]=#{t: y(t)=n}. In the image below, the top left is a plot of the signal for a thousand time steps, and the bottom left is its histogram (over about ten million time steps, averaged across thousand-step intervals to get a central value and a standard deviation).
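The histogram is straightforward to compute; a sketch, reusing the list y from above:

```python
from collections import Counter

h = Counter(y)        # h[n] = #{t : y(t) = n}
print(max(y))         # the maximum, which grows roughly like log(t)
for n in sorted(h):
    print(n, h[n])
```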
At first it may seem that using the Fast Fourier Transform is a good idea, but the result is practically noise (the top right is the FFT result for one million time steps, and the bottom right is the histogram of the intensity across the frequency spectrum). From the bottom graph it can be seen that the intensity distribution matches that of Gaussian noise, so any hope of using the Fourier transform is gone. As of now, the signal seems to be random.
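For concreteness, a minimal sketch of this spectral check using numpy (my choice of library; any FFT routine would do, and the plotting details of the figure are not reproduced here):

```python
import numpy as np

sig = np.asarray(y, dtype=float)
Y = np.fft.rfft(sig - sig.mean())   # subtract the mean to remove the DC peak
intensity = np.abs(Y) ** 2
# A histogram of `intensity` is what the bottom-right panel shows;
# it matches the intensity distribution of Gaussian noise.
```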
Now let us generate a new time series from the original time series y; define z_k(t) as
z_k(t) = 1 if y(t) = k, else 0
We will take k=3 and so we have
z_3={0,0,0,0,1,0,0,1,1,0,0,0,0,0,1,1,0,...}
From this we generate another gap sequence, which we call y_k(t) (dropping the initial zeros):
y_3={3,1,6,1,4,8,1,2,7,7,2,2,4,1,11,5,9,2,...}
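A sketch of this construction, again reusing y from above (the function name is mine):

```python
def gaps_of_value(y, k):
    # z_k(t) = 1 if y(t) = k else 0; return the gaps between
    # successive ones, dropping everything before the first one.
    ones = [t for t, v in enumerate(y) if v == k]
    return [ones[i] - ones[i - 1] for i in range(1, len(ones))]

y3 = gaps_of_value(y, 3)   # [3, 1, 6, 1, ...] for the sample above
```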
Doing the same analysis as above, we get:
It can be seen that the data still looks similar to the previous data, with similar characteristics (though the histogram is much cleaner). But there is still no hope of getting any prediction out of these. We can repeat the process of generating a new time series from y_3, as we did for y, but the result again seems to be the same. If we continue this process for other values of k, we get similar results, signifying only that the data is self-similar.
Because of the self-similarity we can try to see if there is some kind of correlation that may depend on many variables (as in a Markov process). So we try to compute
P(y(t) | y(t-1), y(t-2), ..., y(t-k))
and see if we can come up with a k such that the above probability collapses to a Dirac (degenerate) distribution. This analysis is done on a time series of length 16,649,376. Because of this finite length we cannot go too high in k; we take k from 2 to 8, but this is not enough to get the desired result.
We first create a hash function p(t) = 64^k y(t-k) + 64^(k-1) y(t-k+1) + ... + y(t) (the maximum value of y(t) obtained is 63, so base 64 makes the encoding injective) and look for an f such that y(t+1) = f(p(t)). If this held for a large number of points, we would have some hope of eventually finding a relation. The graphs below negate any such hypothesis. In the left graph, the x-axis is the number of distinct y(t+1) values seen for a fixed p(t), and the y-axis is the average number of times that p(t) occurs in the time series. It can be seen that, on average, each time a given p(t) is reached, all possible values of y(t+1) are obtained with roughly equal probability. In the right graph, the x-axis is the chain length of the Markov chain (that is, k) and the y-axis is the maximum number of distinct y(t+1) values for a fixed p(t).
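Here is a minimal sketch of the counting behind these graphs (the function and variable names are mine, for illustration; the original analysis presumably ran over the full 16-million-step series):

```python
from collections import Counter, defaultdict

def successor_counts(y, k):
    # p(t) = 64^k*y(t-k) + ... + 64*y(t-1) + y(t); since max(y) = 63,
    # base 64 makes the hash injective on contexts of length k+1.
    table = defaultdict(Counter)
    for t in range(k, len(y) - 1):
        p = 0
        for v in y[t - k : t + 1]:
            p = p * 64 + v
        table[p][y[t + 1]] += 1
    return table

table = successor_counts(y, k=4)
# If some context determined its successor, len(succ) would be 1;
# in practice the successor counts stay spread out.
max_branching = max(len(succ) for succ in table.values())
```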
These analyses show that the time series y(t) can be used as a random signal (up to some degree).