It’s a good time to reconnoiter and review the things I have in progress and planned, both as a teaser and as a promise. I am currently working on the following technical projects, entirely on my personal time outside of work. The subject material has also been carefully chosen so that it does not infringe in any manner upon the subject material of my employer.
Update, 12th September 2015, 14:28 EDT
- Developing state-space models of time series where observations are censored at periodic intervals. This is intended to support work on environmental time series, principally time series where trained volunteers are enlisted to collect and record data for analysis. (See also.) Two important applications are being pursued. First, there is a study of the Town of Sharon’s water supplies, and of the forcing functions, from precipitation through stream levels, which dictate well levels. This involves hydrology, which I am tooling up on, as well as numerical maths involving the dlm and KFAS packages of R. The trick is that the effects of precipitation and other flows are likely to appear as lagged and modulated realizations of them, and that, while precipitation is available all year, stream levels and well levels are poorly measured, at times with gross errors. These errors arise simply from the nature of the measurement. Second, Buzzards Bay near Cape Cod is being monitored by a network of volunteers with the guidance and direction of Woods Hole Oceanographic Institution (WHOI), led by Jennie Rheuban, Research Associate, Marine Chemistry & Geochemistry.
- Documenting a technique that applies a Bayesian sensitivity analysis of the network corresponding to human vascular blood flow to differential counts of ctDNA at multiple sites, in order to help isolate resurgent tumors. This is simply an application of eigendecomposition of such a network, with sites identified as components in its state description, and using a Bayesian inversion to hindcast the location of a source of ctDNA. The idea works because ctDNA from such sites has a short half-life in the bloodstream.
- (Update.) Paul Pukite and Graham Jones have been developing a characterization of the ENSO which involves a physical sloshing model. They’re seeking help in model evaluation and assessment of time series. I’ve decided to pitch in. This is another Azimuth Project effort. See the following links:
- https://forum.azimuthproject.org/discussion/comment/14860/#Comment_14860
- https://forum.azimuthproject.org/discussion/comment/14853/
- http://modules.lossofgenerality.com/
- https://johncarlosbaez.wordpress.com/2014/06/20/el-nino-project-part-1/
- https://forum.azimuthproject.org/discussion/1480/tidal-records-and-enso
- https://forum.azimuthproject.org/discussion/1415/some-el-nino-related-data
- http://dhivehi.wunderground.com/blog/PaulPukite/show.html
- Developing and popularizing big data versions of the Singular Value Decomposition (SVD), which is the key component for building recommender systems and many other latent space analyses, including multidimensional scaling. There’s little new needed here, apart from invoking the couple of packages available in R for doing these decompositions, and applying them to several cases of interest. These include mining large sets of environmental time series for latent behaviors. I have pretty much given up on Python use here, even if Python is a promising language. My difficulty with Python has less to do with its technical aspects and more to do with its sociology: its community seems to consist of people who really don’t appreciate the wonders that can be had by working much smarter. Moreover, R is getting better every day.
- Somewhat related to the SVD work, developing and popularizing big data versions of state-space analysis techniques, like those provided in the dlm and KFAS packages of R. In fact there is a connection, for dlm does what it does because it expresses the Kalman filtering and Rauch-Tung-Striebel smoothing algorithms in terms of an SVD.
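To make the state-space items above a little more concrete, here is a minimal sketch of Kalman filtering when some observations are unavailable. It is not the dlm or KFAS implementation and does not use an SVD; it is a plain scalar local-level (random walk plus noise) filter written with only the Python standard library. The function name `local_level_filter`, the variance values, and the toy series are all hypothetical, and the sketch assumes that a censored observation can be treated as simply missing, which is a simplification.

```python
def local_level_filter(ys, obs_var, state_var, m0=0.0, c0=1e6):
    """Kalman filter for a scalar local-level model.

    ys        : observations; None marks a missing (censored) value
    obs_var   : observation noise variance (assumed known)
    state_var : random-walk innovation variance (assumed known)
    m0, c0    : prior mean and (deliberately vague) prior variance
    Returns the filtered means and variances, one per time step.
    """
    m, c = m0, c0
    means, variances = [], []
    for y in ys:
        # Predict: the level is a random walk, so only the variance grows.
        r = c + state_var
        if y is None:
            # No observation: keep the predicted mean, carry the larger variance.
            c = r
        else:
            # Update: blend prediction and observation via the Kalman gain.
            k = r / (r + obs_var)
            m = m + k * (y - m)
            c = (1.0 - k) * r
        means.append(m)
        variances.append(c)
    return means, variances


# Toy series (made up): two readings, a two-step gap, then one more reading.
series = [1.0, 1.2, None, None, 0.9]
means, variances = local_level_filter(series, obs_var=0.5, state_var=0.1)
for t, (m, v) in enumerate(zip(means, variances)):
    print(t, round(m, 3), round(v, 3))
```

During the gap the filtered mean stays put while its variance grows by `state_var` at each step, and the variance shrinks again once a reading arrives. That is the basic behaviour one wants from a model of volunteer-collected series with seasonal holes in them.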
That’s the menu.
Update, 21st December 2015
An update on the key problem concerning the water supply in the Town of Sharon by Paul Lauenstein can be found here.