Faculty Sponsor

Dr. Michelle DeDeo

Faculty Sponsor College

College of Arts and Sciences

Faculty Sponsor Department

Mathematics & Statistics

Location

SOARS Virtual Conference

Presentation Website

https://unfsoars.domains.unf.edu/2021/posters/the-opioid-epidemic-on-the-first-coast/

Keywords

SOARS (Conference) (2021 : University of North Florida) – Archives; SOARS (Conference) (2021 : University of North Florida) – Posters; University of North Florida -- Students -- Research – Posters; University of North Florida. Office of Undergraduate Research; University of North Florida. Graduate School; College students – Research -- Florida – Jacksonville – Posters; University of North Florida – Undergraduates -- Research – Posters; University of North Florida. Department of Mathematics and Statistics -- Research – Posters; Project of Merit Award Winner

Abstract

Project of Merit Winner

The nation has been focusing on the opioid epidemic for many years. Aggregate quarterly data on opioid distribution has been available through the Drug Enforcement Administration (DEA), but it cannot be used to analyze the effects of opioids in local areas. Quantifying impacts of the opioid epidemic at the local level has never been easy: what little data the DEA provided was not user-friendly, was overly broad, and did not follow the desired timeline of data collection. This project focuses on database exploration and uses statistical methods and Decision Tree analysis to predict the expected annual opioid saturation across Northeast Florida. These analyses can support the community, healthcare systems, and public servants in addressing problems surrounding opioid sales and abuse, in hopes of finding solutions targeted to specific areas of the First Coast, which includes Baker, Clay, Duval, Nassau, Putnam, and Saint Johns counties. The project consists of four objectives: database exploration to access and extract relevant data from multiple sources; data cleaning, which involved deciphering and connecting data sets; statistical analysis utilizing estimated means and decision trees; and the creation of an interactive dashboard, available on Tableau Public, which also provides downloadable data for public use. Surprisingly, the estimated mean pill counts post-2011 are all higher than before legislation aimed at reducing opioid abuse went into effect. We conclude that although total pill sales have decreased, per-person pill sales have risen.

Comments

Audio Presentation Transcript:

The Opioid Epidemic on the First Coast
(video transcript)

Hello and welcome to our presentation on the Opioid Epidemic on the First Coast. Our team is led by Dr. Michelle DeDeo and the undergraduate students are Jeremiah Baclig, Noah DeDeo, Rukhaiya Husain, and Iliya Kulbaka.

This presentation is part of a larger project that focuses on identifying statistical relationships between the opioid epidemic and socioeconomic variables. We applied statistical methods to data recently made public by the Drug Enforcement Administration (DEA) through its Automation of Reports and Consolidated Orders System (ARCOS).

According to the DEA, more than 30 million transactions are reported each year into the ARCOS database, a comprehensive drug reporting system that monitors the flow of controlled substances from their point of manufacture to the dispensing level. The complexity and nuance of the opioid crisis span public and private sectors that operate in opaque environments with limited guidance.

Assessing public health impacts requires not only access to this data but also careful academic analysis. The results of this project are, and will continue to be made, available to the public to support understanding and data-driven decisions in the community and in the State. We also explore the effect of Florida’s 2011 “Pill Mill” Laws. By 2010, 90 of the top 100 doctors prescribing opioid medications were practicing in Florida. Subsequently, three regulatory laws were passed in 2010 and 2011 to prevent “pill mills” in the state. These pill mills were pain management clinics that prescribed opioids at an alarming rate. Opioid sales changed drastically after the passage of these laws, but the true effect of this change in local communities has been hard to visualize.

Pre- and post-analyses of the impact of the bill are invaluable.

Our team accessed the relevant ARCOS transaction data from 2006 through 2014, identified limitations in the database, found database “quirks”, cleaned and connected to multiple databases, and found statistically relevant relationships between opioid distribution and socioeconomic characteristics in Northeast Florida. Furthermore, these relationships and findings were used to create an online interactive map (that we will share with you all in a moment).

Why is our project special? The reasons are 4-fold:

The first is that our team is a mix of disciplines: data science and computer science majors for Python skills, statistics majors for running statistical tests in JMP and SAS, and biology majors for mapping with GIS and Tableau.

The second reason is that it’s a regional project. Our project focuses only on the First Coast region: Baker, Clay, Duval, Nassau, Putnam, and St. Johns counties.

Our third is that the results are significant because the opioid crisis is ongoing.

And our fourth is that it is innovative! Because the ARCOS data was only recently released, no one has yet successfully analyzed it.

There were 4 main databases our team explored. The first was the ARCOS database, the DEA’s Automation of Reports and Consolidated Orders System. The next was the DEA’s Quarterly Retail Drug Summary reports. We also used the Census and American Community Survey (ACS), and the RUCA and RUCC codes from the USDA Economic Research Service. Let’s go over these individually.

The nine years’ worth of ARCOS data painted a comprehensive picture of the crisis. Previously available data offered only proxy measures for different aspects of it, such as Medicare Part D and mortality data. The ARCOS data was released over a year ago, but deciphering it was very difficult.

Until recently (July and December 2019), the ARCOS database was accessible only to the DEA, the Justice Department, and other related agencies. After a long legal battle in federal court brought by the Washington Post and HD Media, the Washington Post was able to release a significant portion of the database. The raw ARCOS data is on GitHub. Initially, only the years 2006-2012 were released. Manufacturers argued that releasing later years would reveal proprietary sales data to their competitors. The courts settled on releasing the 2013 and 2014 data, and the additional data was made available in 2020.

Hello everyone, my name is Iliya Kulbaka, and in this slide I am introducing you to the DEA Quarterly reports. Unlike the ARCOS data, which records details down to the address of a pharmacy, the DEA’s quarterly reports capture only a broad snapshot of the transactions. The quantities are all given in MME (Morphine Milligram Equivalents) rather than as counts of pills or liquids, because those can be dispensed in different strengths. As you can see in this snapshot of the data, summaries are provided only for the first 3 digits of a zip code. To put this into perspective, every zip code in Duval County begins with 322, but there are also zips in St. Johns County that begin with 322. In addition, zips that are consecutive in order may be far apart geographically. For example, 32086 is in St. Johns County and 32087 is in Baker County.
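
To illustrate why MME serves as a common unit across pill strengths, here is a minimal sketch. The conversion factors (1.5 for oxycodone, 1.0 for hydrocodone) are the commonly published oral factors and are an assumption here, since the presentation does not state which factors were used. The function and variable names are hypothetical.

```python
# Oral MME conversion factors. These are assumed (commonly published values),
# not taken from the presentation.
MME_FACTOR = {"oxycodone": 1.5, "hydrocodone": 1.0}


def shipment_mme(drug, strength_mg, pill_count):
    """Return the morphine milligram equivalents contained in one shipment."""
    return strength_mg * pill_count * MME_FACTOR[drug]


# Two shipments with the same pill count but different tablet strengths
low = shipment_mme("oxycodone", 5, 100)    # 5 mg tablets
high = shipment_mme("oxycodone", 30, 100)  # 30 mg tablets
print(low, high)
```

The two shipments contain the same number of pills but very different amounts of opioid, which is the distinction a pill count alone would hide.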

The annual American Community Survey (ACS) is like the 10-year census but is longer, and not every household is required to complete it. It provides important social and economic measures necessary for our analysis. We used this database to determine the population and the poverty status of the populations within each zip code.

The USDA Economic Research Service publishes RUCA codes (Rural-Urban Commuting Area codes) that classify zip codes based on the urbanization, population density, and daily commuting patterns of their populations. Another measure, the Rural-Urban Continuum Code (RUCC), is a county-wide classification based on the degree of urbanization and adjacency to a metro area. These two codes were essential to our analyses, as we will explain. Their values are whole numbers (1-10) that delineate metropolitan, micropolitan, small town, and rural areas.

Now we get to Data Extraction and Cleaning where we extract data from ARCOS databases, ACS, RUCA, and RUCC. After that we process and use it for our analysis purposes.

Throughout the process of data extraction and cleaning, we faced a few issues. One of them was the ARCOS database itself. The ARCOS data set was over 130 GB of raw, uncleaned data, and the only guide to it was the ARCOS Handbook, which is 191 pages long and was last updated in 1997.

Although several news publications and articles have used the raw data, there was still no clear, known methodology for deciphering the ARCOS database accurately. The issues included multiple shipments or shipments within shipments; transaction types such as sales, returns, disposals, transfers, missing or lost shipments, and cancellations, which at times were reported much later; DEA actions such as deletions, additions, and insertions of late data; incorrect reporting of NDCs (National Drug Codes); and much more.

For example, there are currently 15 steps in the data validation process. Issues with the quantity of data also arose. In the Florida data for zip codes beginning with 32, over 15 million records and 43 variables were found. This resulted in over 64 million data points. However, we did not include data on liquid or caplet forms of opioids, or on 10 other opioid types, because they were shipped in much lower quantities. So, from an initial set of over 4 million records, we were able to strip the data down to 1.4 million records that consist of all sales of oxycodone and hydrocodone pills on the First Coast. We were then able to cross-check this data successfully against the DEA Quarterly reports for each quarter.
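
The kind of filtering described above could look roughly like the sketch below. It is not the team's actual 15-step validation process. All field names and code values are hypothetical placeholders rather than real ARCOS column names, and the records and zip codes are illustrative only.

```python
# All field names and code values are hypothetical placeholders, not the
# actual ARCOS column names.
KEPT_DRUGS = {"OXYCODONE", "HYDROCODONE"}


def is_first_coast_pill_sale(record, first_coast_zips):
    """Return True for a sale of oxycodone or hydrocodone tablets in the region."""
    return (
        record["drug"] in KEPT_DRUGS
        and record["form"] == "TAB"          # drop liquids and other forms
        and record["transaction"] == "SALE"  # drop returns, transfers, etc.
        and record["buyer_zip"] in first_coast_zips
    )


# Illustrative records and zip codes only
zips = {"32202", "32177"}
records = [
    {"drug": "OXYCODONE", "form": "TAB", "transaction": "SALE", "buyer_zip": "32202"},
    {"drug": "OXYCODONE", "form": "LIQ", "transaction": "SALE", "buyer_zip": "32202"},
    {"drug": "MORPHINE", "form": "TAB", "transaction": "SALE", "buyer_zip": "32177"},
    {"drug": "HYDROCODONE", "form": "TAB", "transaction": "RETURN", "buyer_zip": "32177"},
    {"drug": "HYDROCODONE", "form": "TAB", "transaction": "SALE", "buyer_zip": "33101"},
]
pill_sales = [r for r in records if is_first_coast_pill_sale(r, zips)]
print(len(pill_sales))
```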

In addition to the issues which were brought up by the ARCOS database, the DEA Quarterly reports also had their own “quirks”.

When extracting data for Florida zip codes, we faced an issue where data gathering manually became way too cumbersome. DEA Quarterly reports suffered from inconsistent data logging, which required manual data set editing. For that reason, a Python script was constructed to read the data files, locate Florida data for a particular drug code, and extract quarterly quantities in grams.

Hello, everybody. My name is Jeremiah, and this slide covers the verification of the ARCOS data against the DEA Quarterly reports. This validation was calculated in MMEs, as introduced earlier by Iliya. The ARCOS data needed to be adjusted to match the format of the quarterly reports, which meant that the approximately 1.4 million individual shipments to pharmacies and retailers had to be compiled down to the 36 data points, defined by quarter and by the first 3 digits of the zip code, found in the quarterly reports over the 9-year span. A Python script was created to iterate through the ARCOS data, compile it by quarter based on the specific transaction dates, and output it in a matching format. As seen in the graph, there are slight differences in the values, and these differences can be attributed to delays and other issues in retailers reporting shipments to the DEA. Overall, over 680 million pills were delivered into NE Florida during this time.
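
A minimal sketch of this kind of roll-up is shown below. It is not the team's script: the record fields (`zip`, `date`, `mme`) and the sample values are hypothetical, and the comparison helper is simply one way to express the difference from a quarterly report total.

```python
from collections import defaultdict
from datetime import date


def quarter_of(d):
    """Map a date to its calendar quarter (1 through 4)."""
    return (d.month - 1) // 3 + 1


def totals_by_quarter(shipments):
    """Sum MME by (3-digit zip prefix, year, quarter)."""
    totals = defaultdict(float)
    for s in shipments:
        key = (s["zip"][:3], s["date"].year, quarter_of(s["date"]))
        totals[key] += s["mme"]
    return dict(totals)


def percent_difference(arcos_total, report_total):
    """Difference of the ARCOS roll-up from the report, as a percentage."""
    return 100.0 * (arcos_total - report_total) / report_total


# Illustrative shipments only (hypothetical values)
shipments = [
    {"zip": "32202", "date": date(2010, 1, 15), "mme": 100.0},
    {"zip": "32207", "date": date(2010, 3, 31), "mme": 50.0},
    {"zip": "32202", "date": date(2010, 4, 1), "mme": 25.0},
]
totals = totals_by_quarter(shipments)
print(totals)
```

Keying on the 3-digit prefix mirrors the granularity of the quarterly reports, so each rolled-up total can be compared directly with one reported value.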

Although the available data is extensive and relatively easy to understand, the database retrieval interface is not without its issues. The Census interface can be puzzling to navigate at first, making mass retrieval feel clumsy at times. Some entries were also missing: the 2016 population data was entirely absent, so the average of the 2015 and 2017 data was used in its place. Lastly, in order to find data specific to the First Coast, individual zip codes needed to be selected. Care was taken to also include two non-conforming zip codes (32666 and 32656), which do not begin with 320, 321, or 322 like the rest of the First Coast.
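
The gap-filling step for the missing 2016 population can be sketched as follows. The function name and the population figures are hypothetical; only the rule (use the average of the two neighboring years) comes from the presentation.

```python
def fill_missing_year(population_by_year, missing_year):
    """Fill one missing year with the average of its two neighboring years."""
    filled = dict(population_by_year)
    if missing_year not in filled:
        before = filled[missing_year - 1]
        after = filled[missing_year + 1]
        filled[missing_year] = (before + after) / 2
    return filled


# Illustrative population values for a single zip code
population = {2015: 1000, 2017: 1100}
filled = fill_missing_year(population, 2016)
print(filled[2016])
```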

There are 60 zip codes grouped into 20 commonly known Area names. The regions were formed by grouping zip codes with their nearest neighbors, based on proximity and access, and labeling them by their common Area names. Both NAS JAX and NAS Mayport were removed because they accounted for less than 2% of all sales and because they serve the military for the entire North Florida area. In generating the regions for analysis, we created an Adjusted RUCA score for each region using two classification schemes (the RUCA and the RUCC) and two balancers (for areas with excessive poverty and areas with a disproportionate percentage of people over 65).

When calculating the adjusted RUCA, as seen in the flowchart, if the RUCA value was greater than or equal to the 2013 RUCC, we used the RUCA value. This was the case in 17 of the 19 areas. If the 2013 RUCC was greater than the RUCA, we took the average of the 2003 and 2013 RUCC. This occurred in the other 2 areas: Macclenny and Outer Palatka. For the population living below the poverty level, we added 1 point for every 50% increase above the NFL average of the population below poverty. This occurred in 1 area, Callahan, which gained 2 points. For the population over 65, we added 1 point for every 3% increase above the NFL average of the population over 65. This occurred in 3 areas: Fernandina, which gained 3 points; SW Clay, which also gained 3 points; and Outer Palatka, which gained 5 points. The maximum number of points an area could accumulate was 6.
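
The scoring rules above can be sketched in code, with several assumptions stated up front. The sketch treats the poverty rule as one point per full 50% relative increase over the regional average, and the over-65 rule as one point per full 3 percentage points above the regional average. The presentation does not spell out either interpretation, nor whether the 6-point maximum is an enforced cap or an observed total, so no cap is modeled. All function names and input values are hypothetical.

```python
import math


def base_score(ruca, rucc_2003, rucc_2013):
    """Use the RUCA unless the 2013 RUCC exceeds it; then average the two RUCCs."""
    if ruca >= rucc_2013:
        return float(ruca)
    return (rucc_2003 + rucc_2013) / 2


def poverty_points(area_pct, regional_pct):
    """Assumed reading: 1 point per full 50% relative increase over the average."""
    relative_increase = (area_pct - regional_pct) / regional_pct
    return max(0, math.floor(relative_increase / 0.5))


def senior_points(area_pct, regional_pct):
    """Assumed reading: 1 point per full 3 percentage points above the average."""
    return max(0, math.floor((area_pct - regional_pct) / 3))


def adjusted_ruca(ruca, rucc_2003, rucc_2013,
                  poverty_pct, regional_poverty_pct,
                  senior_pct, regional_senior_pct):
    """Combine the base score with the two balancers (no cap is applied)."""
    return (base_score(ruca, rucc_2003, rucc_2013)
            + poverty_points(poverty_pct, regional_poverty_pct)
            + senior_points(senior_pct, regional_senior_pct))


# Illustrative inputs only, not values from the project
example = adjusted_ruca(2, 6, 4, 30.0, 15.0, 25.0, 16.0)
print(example)
```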

After considering all of the variables mentioned, we ranked the areas by Opioid Saturation per person, calculated as the total number of pills sold over time by Buyers in each zip code divided by the total population of each zip code (using the ACS 5-year average population between 2008 and 2012). Here, the color and height correspond to pill counts per person, and the circles represent the adjusted RUCA scores. Notice where our Rural area (Outer Palatka) and our Micropolitan area (Palatka) lie on the graph compared to Downtown, where we have big hospital systems whose patients may fill their discharge medications at nearby retailers out of convenience. Also, these are 9-year totals, so those of you familiar with Nocatee can understand why its pill count is so low, since the population only recently started booming. Now, the ranked graph shows which Areas sold the most pills geographically; however, it does not capture an important event in time: the passing of Florida’s Pill Mill Laws of 2011.
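
As a small worked example of the saturation measure, the sketch below divides total pills by population and ranks the results. The area names and numbers are placeholders, not project data.

```python
def pills_per_person(total_pills, population):
    """Opioid saturation: total pills sold divided by the area's population."""
    return total_pills / population


# Placeholder areas: (total pills over the period, average population)
areas = {
    "Area A": (1_200_000, 20_000),
    "Area B": (900_000, 30_000),
    "Area C": (2_000_000, 100_000),
}

saturation = {name: pills_per_person(*values) for name, values in areas.items()}
ranked = sorted(saturation, key=saturation.get, reverse=True)
print(ranked)
```

Note that Area C has the most pills in total but the lowest saturation, which is why ranking per person can differ sharply from ranking by raw sales.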

To estimate the impact that the Pill Mill Laws had on the concentration of opioids sold per person annually in each area, a Decision Tree Analysis was run on the areas in three different time periods. We’ll start with this overall summary of the 9 years of ARCOS data. A decision tree was used because, with only 20 areas, our data set has few observations, and these trees don’t require large-sample assumptions to be met. Further, we have relatively low R-square values, but regardless, they paint a general picture of the crisis.

We first find the annual mean pill count per person and then find the factor (in our case, either adjusted RUCA score or level of poverty) whose split accounts for the most variance among the Areas. Splitting at adjusted RUCA, we get the next two grouped means. Micropolitan, with an adjusted RUCA score of 4, is set at a mean pill count of 78.5 per person. We split again on the adjusted RUCA score and variance stabilizes, and we are left with an expected annual pill count per person of 36.7 for Metro Area and Rural areas and 55.6 pills for the actual metropolitan. Interestingly, poverty as a factor did not significantly predict annual pill counts for this analysis.
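
The general idea of a single regression-tree split can be sketched as below: try each candidate threshold on one factor and keep the one that leaves the least squared error around the two group means. This illustrates the technique in general, not the team's software or settings, and the data points are invented for the example.

```python
def sse(values):
    """Sum of squared errors of the values around their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(points):
    """Find the threshold on one factor that minimizes within-group error.

    points is a list of (factor_value, target) pairs. Returns a tuple
    (threshold, left_mean, right_mean), or None if no split is possible.
    """
    levels = sorted({x for x, _ in points})
    best = None
    for lo, hi in zip(levels, levels[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in points if x < threshold]
        right = [y for x, y in points if x >= threshold]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold,
                    sum(left) / len(left), sum(right) / len(right))
    return None if best is None else best[1:]


# Invented (adjusted score, annual pills per person) pairs
data = [(1, 50), (1, 60), (2, 35), (2, 40), (4, 80), (4, 75)]
print(best_split(data))
```

A full tree repeats this search within each resulting group until further splits stop reducing the error meaningfully.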

To more easily visualize those expected means, we have this handy map. Notice that the blue metropolitan area has much lower annual opioids per person than the areas mentioned earlier. Remember, Palatka was number 1 (that’s the red region in the south), and Macclenny is farthest west.

Similarly, we also analyzed the data before and after January 1, 2012, the date when Florida’s Pill Mill Law, passed in 2011, went into effect to see how opioid concentration per person changed.

The pre-2012 analysis consists of 6 years of data while the post-2011 consists of 3 years. What is especially fascinating about these estimated numbers is that they are all higher after than before the Pill Mill Laws were enacted.

Essentially, what we can observe is that although the total amount of pills sold has decreased over time, the amount of pills sold per person in Northeast Florida has still managed to increase despite the State’s efforts in passing the Pill Mill Laws. This is because, although mean pill counts have gone down from the peak of the crisis (remember this handy visual over time), they have not returned to pre-crisis levels.
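
The arithmetic behind this observation can be shown with a small sketch. The yearly figures below are invented purely to show how a post-period mean can exceed a pre-period mean even while values fall from their peak; they are not project results.

```python
def period_mean(yearly, start, end):
    """Mean of the yearly values from start through end, inclusive."""
    values = [v for year, v in yearly.items() if start <= year <= end]
    return sum(values) / len(values)


# Invented annual pills-per-person values: a rise, a peak, then a decline
yearly = {
    2006: 20, 2007: 30, 2008: 40, 2009: 50, 2010: 70, 2011: 60,
    2012: 55, 2013: 50, 2014: 48,
}

pre = period_mean(yearly, 2006, 2011)   # six years before the effective date
post = period_mean(yearly, 2012, 2014)  # three years after
print(pre, post)
```

Because the earlier period includes the low years before the crisis peaked, its average sits below the later period's average even though the later years are declining.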

In the following slide, I’ll introduce the online, interactive dashboard that was created to visualize the data. Using Tableau gave us the ability to filter on demand by region, zip, adjusted RUCA, and custom area. Simple clicks allow you to dig deeper into the underlying data, and lastly, the dashboard serves as a publicly available, easily downloadable data resource.

This is the Public Tableau dashboard. The elements in each graph have differing functions in terms of visualization, and this data is fully accessible in terms of filtering what you would want to analyze, be it by region, timeframe, etc. I’ll start off by taking a look at the map on the left. As you can see, the colors dictate the different common area names in NE Florida, and the size of the circles shows the pill concentration in that area. As I zoom in, I’m able to hover over the circles and see the following information. If I would like to view it in a text format in my browser, I can simply click on it, click on view data, and I can view it here. I can also press this button here to view the data, and even download for myself in a csv format. You can also drag and select multiple areas and view the details for those. Now let’s take a look at the two charts on the right side. This chart goes over the average pills per month per person by each region. The colors correspond to the legend on the left side. The cool thing about this is how you can easily click one of the regions, (I just chose the Westside) and it will highlight for me that region on the map. As you can see, it will reflect the changes on the bottom right graph too. Once again, you can hover over the data points to see the exact details. Speaking of the bottom right graph, this particular time series has the feature to play through it. So we can see the transformation of the pills over time from the start. And if we would like to reset our parameters, press this button here. Now let’s do this from the top.

Streaming Media 1

Rights Statement

http://rightsstatements.org/vocab/InC/1.0/

Included in

Mathematics Commons

Apr 7th, 12:00 AM

Opioid Epidemic on the First Coast


https://digitalcommons.unf.edu/soars/2021/spring_2021/12

 
