Many states use Tableau for data analysis and interactive web display. The software lets analysts create interactive data experiences without programming experience. Users can connect to a variety of existing data sources. Many people start with an Excel connection because it’s familiar and easy to manipulate, but for a viz that is maintained over the long term Excel has some disadvantages: typos or copy-paste errors can put incorrect data in the file, people juggling multiple responsibilities may miss new data releases, and multiple file versions or multiple users can introduce errors. Connecting directly to a database source helps ensure that the viz (and any other data product) remains consistent and up to date.
For many uses, connecting directly to the WID (or database views on the WID) solves most of these problems: doing so keeps the source data in a predictable format and helps prevent incorrect values from showing up in the end product. Maintenance is reduced to keeping track of release dates and refreshing the connection at appropriate times.
There are plenty of times analysts may have reason to use data that’s not stored in the database, though. In that case, there’s an alternative way to get the benefits of connecting to the database while expanding beyond the data that’s internally available. Users can connect directly to publicly available data sources on the internet.
Web Data Connectors (WDC)
Data sources on the internet can come in a variety of formats and structures. They can be XML or JSON, secured or unsecured, single- or multi-table. Tableau can connect to most of these types, but it needs the structure of the data to be defined first. That definition is called a Web Data Connector (WDC), and the option is available on the Data Sources screen under “More” in the “Other Database Connections” section of the menu.
Several of these exist and are publicly available on the Tableau user forums. That list doesn’t include BLS, BEA, Census, or CareerOneStop data connectors, though, which are the large sources of data that are most likely to be of use to Labor Market Information offices.
What does the WDC do?
All the connector does is define a relationship between Tableau and another data source. It doesn’t control either of those; the data still lives and is maintained elsewhere. Server and firewall settings on either the user’s side or the data source’s side can affect the connection, and authentication (passwords and user keys) continues to be managed by the data source. Should the structure or content of the data change, the end user still needs to be aware of those changes and adjust their use of the data accordingly.
Once the data structure is defined in the WDC, Tableau reaches out and pulls the data into its own format. It requires an extract for this kind of source, meaning that you’re copying it to a local destination and will need to refresh the connection each time there’s a new data release.
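To make the two jobs of a connector concrete (describe the columns, then flatten the source’s records into rows), here is a conceptual sketch in Python. It is not the ARC connector code, and real WDCs are JavaScript pages rather than Python. The column list and the inline sample record layout are assumptions, loosely modeled on an earthquake-style JSON feed.

```python
# Conceptual sketch only: real WDCs are JavaScript pages, not Python.
# SCHEMA is a hypothetical column definition, not taken from any ARC connector.
SCHEMA = [
    {"id": "id", "type": "string"},
    {"id": "mag", "type": "float"},
    {"id": "title", "type": "string"},
]

# Tiny inline sample standing in for a JSON response from a data source.
SAMPLE_RESPONSE = {
    "features": [
        {"id": "ak001", "properties": {"mag": 4.6, "title": "M 4.6 - Alaska"}},
        {"id": "ci002", "properties": {"mag": 5.1, "title": "M 5.1 - California"}},
    ]
}


def to_rows(response):
    """Flatten nested JSON records into flat rows whose keys match SCHEMA."""
    rows = []
    for feature in response["features"]:
        props = feature["properties"]
        rows.append(
            {"id": feature["id"], "mag": props["mag"], "title": props["title"]}
        )
    return rows


rows = to_rows(SAMPLE_RESPONSE)
```

Each entry in `rows` corresponds to one record in the extract Tableau would build. The source data itself is untouched.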
How can I use a WDC?
The ARC has built a handful of WDCs for connecting to federal data sources containing LMI data. This doesn’t cover every useful federal data source and there may be design choices that could be refined. The primary goal of this effort was to create enough different types of connectors to allow states to test the technology and let us know if it’s valuable and should be expanded or done differently. This is new territory for the ARC. While the source APIs are pretty stable and changes are not anticipated to be frequent or disruptive, there may be unforeseen maintenance issues. Test them and let us know about issues, and if you consider them valuable let us know that, too – we’ll take it under consideration going forward.
Both the BEA and BLS APIs require API keys. These can be requested and are granted through an automated process, but each state will have to request its own.
There is a sample WDC in the directory that returns earthquake data and can be used for testing before a state has requested a key to one of the others.
Some states have more restrictive IT policies than others. This process accesses two files on the ARC file server (data.widcenter.org), passes them through Tableau, and then hits a federal website. How that’s handled may vary from state to state; if it’s insurmountable, please let us know!
During internal testing we discovered that the WDCs require relatively current versions of Tableau. If yours is out of date it may block the connection.
Although we’ve set these up to be available to everyone, you may find that your state IT environment or the goals of your office make it more useful to have your own versions. If you’d like a copy of the files, let us know.
Sometimes the initial connection is slow. Once the extract is created that is no longer a problem, but it’s recommended to start with a small amount of data (a single series, a start date only a few years back) to get a feel for the process.
When a connection is made to a data set via a WDC, the results are returned and can be manipulated like any other table. However, if the connection needs to be refreshed or modified with a new year or other minor parameter change, the previously entered values are not saved, so record the values you used for each connection.
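One lightweight way to record those values is a small JSON file kept next to the workbook. The sketch below is only an illustration. The function name, the parameter names, and the `api_key` field are hypothetical and do not correspond to the actual form fields of any ARC connector. It leaves the key out of the saved file so the record can be shared.

```python
import json
from datetime import date


def save_connection_params(path, connector_url, params, secret_fields=("api_key",)):
    """Write the values typed into a WDC form to a JSON file, omitting secrets."""
    record = {
        "connector_url": connector_url,
        "recorded_on": date.today().isoformat(),
        # Drop any field listed in secret_fields (hypothetical names).
        "params": {k: v for k, v in params.items() if k not in secret_fields},
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2)
    return record
```

For example, calling `save_connection_params("bls_params.json", connector_url, {"series_id": "LNS14000000", "start_year": "2018", "api_key": "..."})` would store the series and start year but not the key.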
Depending on the type of failure, Tableau may not give a meaningful (or any) error message. For example, when connecting to the Census API without a key, hitting your call limit returns an empty record set rather than a notice that you’ve hit the limit. This can make troubleshooting difficult.
Because the WDC connects to a data source that is not controlled by the ARC, information about the data itself should come from the source’s documentation, which is linked on the initial parameter input screen of each WDC. That screen also has notes about what each input field expects, but if they’re not obvious or more clarification is necessary, feel free to ask.
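When troubleshooting outside Tableau, it can help to fetch the same URL in a browser or script and check explicitly for an empty result. The sketch below assumes the response body is a JSON array of arrays with the header row first. The function name is hypothetical, and treating an empty or header-only body as a likely limit or key problem is an assumption, not documented Census behavior.

```python
import json


def check_census_rows(body_text):
    """Return data rows as dicts, or raise if the response holds no data."""
    if not body_text or not body_text.strip():
        # Assumption: an empty body may mean a call limit or missing key.
        raise ValueError("Empty response: possibly a call limit or a missing key.")
    table = json.loads(body_text)
    if not isinstance(table, list) or len(table) < 2:
        raise ValueError("No data rows in response.")
    header, *data = table
    # Pair each data row with the header names.
    return [dict(zip(header, row)) for row in data]
```

A check like this turns a silent empty table into an explicit message, which is easier to act on than a blank extract.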
Connect to a WDC
Open Tableau. On the initial data connections screen, select Connect > To a Server > Web Data Connector. A window opens with a field for the connector URL.
Where it says “Enter your web data connector URL here” enter the path to the connector you want to use.
Sample USGS Earthquake data (doesn’t require an API key): http://data.widcenter.org/wfinfodb/Tableau/WDC/earthquakeUSGS.html
ACS variable descriptions:
Comparison Profile: http://data.widcenter.org/wfinfodb/Tableau/WDC/variablecomparisonACS.html
BLS single series: http://data.widcenter.org/wfinfodb/Tableau/WDC/singleseriesBLS.html
BLS multiple series (same as above, but allows comma-delimited list of series ids): http://data.widcenter.org/wfinfodb/Tableau/WDC/multipleseriesBLS.html
BEA Regional Income: http://data.widcenter.org/wfinfodb/Tableau/WDC/regionalincomeBEA.html
A list of popular BLS series ids can be found here: https://api.bls.gov/publicAPI/v2/timeseries/popular. You may also already have some saved that you use in the series report query on the BLS website.
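For reference, the sketch below turns a comma-delimited list of series ids into a request body of the shape the BLS public API v2 documents (`seriesid`, `startyear`, `endyear`, and an optional `registrationkey`). It only builds the payload and sends nothing. Whether the ARC connectors assemble their requests this way is an assumption, and the function name is hypothetical.

```python
import json

# Endpoint of the BLS public API v2 time series service.
BLS_URL = "https://api.bls.gov/publicAPI/v2/timeseries/data/"


def build_bls_payload(series_text, start_year, end_year, registration_key=None):
    """Build a BLS v2 request payload from a comma-delimited series list."""
    # Split on commas and drop blanks and surrounding whitespace.
    series_ids = [s.strip() for s in series_text.split(",") if s.strip()]
    payload = {
        "seriesid": series_ids,
        "startyear": str(start_year),
        "endyear": str(end_year),
    }
    if registration_key:
        payload["registrationkey"] = registration_key
    return payload


# Two series over a short window, in line with starting small.
payload = build_bls_payload("LNS14000000, CES0000000001", 2018, 2020)
body = json.dumps(payload)  # JSON text that would be POSTed to BLS_URL
```

Keeping the year range short and the series list small keeps the first extract quick to build.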
Once you’ve entered the path to the connector, press Enter. A form specific to the data source appears. Enter an appropriate value in each box and click Connect. Once the form is submitted, Tableau takes you to the data connection screen; if you’re a frequent Tableau user, it should be familiar from there.
For more detail, see the Tableau documentation: https://onlinehelp.tableau.com/current/pro/desktop/en-us/examples_web_data_connector.html