Open Communities and Lucrative Networks

I’ve been a Deadhead for over thirty years. One of the things that made the band so easy to get into was the fact that their concert recordings – thousands of them – were freely traded among fans. I had boxes filled with cassettes that allowed me to experience over a quarter century of performances. This also served as an entry point for conversations with other fans that deepened my relationship to the music and the larger community. B-school grads would refer to this as a “network effect.”

Those same B-school folks may also question the wisdom of giving away products to your customers for free. In my case, and for many others, this only enhanced the Grateful Dead’s bottom line. When high-quality recordings of classic shows like Kesey’s Farm ’72 or Cornell ’77 became available, I knew I had to have them because I’d already heard how good the music was! Moreover, I knew that listening to a recording of a concert couldn’t possibly compete with the experience of actually being there. I went to as many shows as I could.

That same thrill of digging through stacks of music happens to me when I’m looking through research and continuing education content on the CAS website. It’s a rich trove of knowledge and experience that’s even more expansive than concert recordings. I mean, the Grateful Dead have a vast catalog, but over a century of research? That’s deep. And just as I get a kick out of being able to consider EVERY version of “Fire on the Mountain” from May ’77, I love poring over all the work on reserving from the 2006 E-forum. As with music, the availability of open research hasn’t kept me from amassing a sizeable collection of commercial textbooks. I even pay money for some texts that I could access on the internet (“An Introduction to Statistical Learning”, “Advanced R”, etc.).

What I’m saying is that it just makes sense to keep some things open and available. Consider Google. Google made its name in internet search without charging anyone a nickel. Think about it: a human activity so useful, so ubiquitous and so necessary that it changed the vocabulary of most languages on Earth. And yet consumers never made any direct payment to the company that developed and provided it! That feels like sort of a big deal. There are loads of other examples of tools and software that are available to users free of charge: Apache, RStudio, Python, React, PostgreSQL, Ubuntu, Git, Django, OpenOffice, etc. It’s fair to say that I wouldn’t be able to get any work done without them.

The CAS is obviously not Google, but we are working to create an ecosystem and a community that benefits from open access and a liberal flow of knowledge. As I said above, over a hundred years of research is out there. We also continue to have an open source software discussion group. A great deal of our continuing education uses open source tools, and we prioritize that for our funded research projects (which you can read for free). To further support this, we launched a GitHub organizational site last year. This happened after lengthy discussions with legal counsel about how to license the work in such a way that contributions could be freely made and shared, while still preserving an incentive for insurers to incorporate software into their closed, proprietary systems. Just like bootleg tapes and store-bought music, each item supports the other.

There’s an additional step that we’re trying to take. The CAS has pursued publicly sharable data for years. Some years ago, we supported the curation of Schedule P reserving data, which has served as the starting point for many research articles and a good deal of continuing education content. Want to look at it? Click here, or check out the `raw` R package, available on CRAN.

We’re doing what we can with what’s already out there, but we want to take it even further. This means approaching insurers or insurance data providers to procure data sources for research and education. We would obviously take every precaution to ensure that no personally identifiable information is received or shared by the CAS. Even with those sorts of safeguards in place, companies remain reluctant to part with their data for public research, and we’ve only been able to secure one such data set. We get it. If data is the new oil, then it represents a commodity with a market value. Companies aren’t keen to part with it without a tangible return benefit.

We’ll continue to pursue that avenue, but we’re opening up to other approaches. This year, we’re contracting with companies that will provide us with data for publicly available research. That data will not be shared beyond the CAS staff actuaries and volunteers who agree to keep the data confidential. We intend to release the code used in the analysis so that actuaries can follow the research line by line. Where possible, we will also share small, illustrative – possibly synthesized – data sets so that actuaries can confirm that the code works and enhance their understanding of the methodologies. They can take those scripts and try them on their own data.

I’m excited about what’s next for research and I hope you are too. If you’ve got data, code, research papers or crunchy soundboards that you’d like to tell me about, by all means, drop me a line:


About Brian Fannin

Brian Fannin is a research actuary at the CAS. He is also the founder of PirateGrunt LLC, a boutique consulting firm based in Durham, North Carolina, specializing in predictive modeling in the property-casualty markets. Fannin has been an Associate of the CAS since 2002 and a Certified Specialist in Predictive Analytics (CSPA) through The CAS Institute since 2017.