In this post I will show you how I built this pretty chart that depicts which party controlled each house of the United States Congress (the Senate and the House of Representatives) between 1968 and 1980:
The chart displays, per year, the party that controls each house and the relative majority under which control is held. The first item is given by the color of the dots and the second by their size. Hovering over any of the dots shows this:
I got the raw data from Wikipedia, and it looks like this:
I had to format the data in a special way to build the chart, especially for the year-by-year continuity. This is because elections are not held every year but are staggered. In the Senate, seats are held for six years but elections occur every two years for approximately one-third of the seats. In the House, elections are held every two years for all members. So after some M programming and a bit of DAX I got it into this shape:
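My actual transformation was done in M and DAX and is not reproduced here. As a rough illustration of the year-by-year expansion only, here is a minimal Python sketch. The `TermRow` class, the `expand_to_years` function, the party label, and the seat counts are all hypothetical; only the output column names come from the chart configuration.

```python
from dataclasses import dataclass


@dataclass
class TermRow:
    house: str           # "Senate" or "House"
    begin_year: int      # first year of the two-year Congress
    end_year: int        # year in which the next Congress begins
    party: str           # controlling party
    majority_seats: int  # seats held by the controlling party
    total_seats: int     # total seats in that house


def expand_to_years(terms):
    """Return one record per house per calendar year covered by each term."""
    rows = []
    for t in terms:
        # The end year belongs to the next Congress, so range() excludes it.
        for year in range(t.begin_year, t.end_year):
            rows.append({
                "Begin Year": year,
                "House Type": 1 if t.house == "Senate" else 2,
                "Controlling Party": t.party,
                "Majority Percentage": round(100 * t.majority_seats / t.total_seats, 1),
            })
    return rows


# Hypothetical seat counts, for illustration only (not taken from the Wikipedia data).
sample_terms = [
    TermRow("Senate", 1969, 1971, "Party A", 57, 100),
    TermRow("House", 1969, 1971, "Party A", 243, 435),
]
yearly_rows = expand_to_years(sample_terms)
for row in yearly_rows:
    print(row)
```

Each two-year term becomes two rows, one per calendar year, which is what lets the scatter chart show a dot for every year even though elections happen only every other year.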
To build the scatter chart, we need two measures, one for the X axis and one for the Y axis. For the X axis, I used the Begin Year and for the Y axis the House Type (1 for Senate, 2 for House). Using Controlling Party for the legend provided the colors and the size was given by the Majority Percentage. This is the final configuration:
The last touch was to use an overlaid text box to label the final Y axis:
As a final note, the Quick Insights feature of Power BI gave me the idea for this chart:
View the complete report here.
The United States’ Social Security Administration publishes counts of the names of newborns. Kendra Little has loaded these counts for the years 1880 to 2014 (which are provided in separate files per year) into a SQL Server database that she has made available here for download. I had some Power BI fun with it exploring the most common names throughout the years.
Word cloud for the top 10 female names between 1900 and 1949:
Sample of animation for the top 5 male names from 1880 to 2014:
Download the Power BI Desktop file here (6MB). The Power BI version is 2.30.4246.181 64-bit (December 2015). It uses version 1.0.2 of the WordCloud custom visual available in the Power BI Visuals Gallery. For a live version of the report, scroll to the end of this article. It was embedded using the new “publish to web” feature in Power BI.
The visuals use a ranking number assigned per year, per gender. My first version calculated the rank in DAX using RANKX but the performance was poor, especially for the animated scatter chart; it took a long time to load and refresh. I solved the problem by calculating the rank in T-SQL like this:
SELECT a.[FirstName] AS [First Name]
     , [ReportYear] AS [Report Year]
     , [NameCount] AS [Name Count]
     , DENSE_RANK() OVER (PARTITION BY [ReportYear], [Gender] ORDER BY [NameCount] DESC) AS Ranking
FROM [ref].[FirstName] a
INNER JOIN [agg].[FirstNameByYear] b
    ON a.FirstNameId = b.FirstNameId;
Table agg.FirstNameByYear has about 1.8 million records.
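To make the ranking logic concrete outside of SQL Server, here is a small Python sketch of what a dense rank partitioned by year and gender does: names are ranked by count within each (year, gender) group, and tied counts share a rank without leaving gaps. The `dense_rank` function and the name counts are hypothetical and only meant to mirror the behavior of the query, not to replace it.

```python
from itertools import groupby


def dense_rank(rows):
    """Add a 'Ranking' per (ReportYear, Gender), highest NameCount first; ties share a rank."""
    def partition_key(r):
        return (r["ReportYear"], r["Gender"])

    ranked = []
    # groupby needs its input sorted by the same key it groups on.
    for _, group in groupby(sorted(rows, key=partition_key), key=partition_key):
        rank, previous_count = 0, None
        for r in sorted(group, key=lambda r: r["NameCount"], reverse=True):
            if r["NameCount"] != previous_count:
                rank += 1  # a new distinct count gets the next rank, with no gaps
                previous_count = r["NameCount"]
            ranked.append({**r, "Ranking": rank})
    return ranked


# Hypothetical counts, for illustration only.
sample_rows = [
    {"FirstName": "Mary", "ReportYear": 1900, "Gender": "F", "NameCount": 300},
    {"FirstName": "Anna", "ReportYear": 1900, "Gender": "F", "NameCount": 200},
    {"FirstName": "Helen", "ReportYear": 1900, "Gender": "F", "NameCount": 200},
    {"FirstName": "Ruth", "ReportYear": 1900, "Gender": "F", "NameCount": 100},
    {"FirstName": "John", "ReportYear": 1900, "Gender": "M", "NameCount": 250},
]
ranked_rows = dense_rank(sample_rows)
ranking_by_name = {r["FirstName"]: r["Ranking"] for r in ranked_rows}
print(ranking_by_name)
```

Because the rank is computed once at load time and stored as a plain column, the visuals only have to filter on it (for example, Ranking <= 10) instead of evaluating a ranking expression on every refresh.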
Follow the steps below to make a pretty card:
One of the great new features in the September 2015 release of the Power BI Desktop is calculated tables. I recently used it to create a table of pharmaceutical product groups, because the group names were present in the Product table but not in a column with unique values; therefore, I could not use the column on the one-side of a 1-many relationship. I created the table Product Group with this formula:
Product Group =
I then tried to create a relationship between the resulting column, which I renamed “Group”, and the column “Brand” on the Speaker Event table:
But this is what I got:
“Missing intermediate data”? After a while, I discovered that at the very end of the Group column there was a blank value. Removing it resolved the error. I did this by changing the calculated table’s formula to this:
Product Group = FILTER(DISTINCT(VALUES('Products'[Group])),'Products'[Group] <> BLANK())
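For readers who think more easily in a general-purpose language, the effect of that formula can be sketched in Python as "distinct values, minus the blank". The `distinct_non_blank` function and the group names are hypothetical; treating both `None` and the empty string as blank is my assumption about how a loose comparison against a blank behaves.

```python
def distinct_non_blank(values):
    """Return unique values in first-seen order, skipping None and empty strings."""
    seen = set()
    result = []
    for value in values:
        if value is None or value == "":
            continue  # assumption: both count as "blank"
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


# Hypothetical group names, with a trailing blank like the one that caused the error.
groups = ["Group A", "Group B", "Group A", None, "Group C", ""]
product_groups = distinct_non_blank(groups)
print(product_groups)
```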
A couple of months back I wanted to do a simple map showing the location of health centers in Puerto Rico based on latitudes and longitudes, using Power View and PowerPivot. However, Power View insisted that all of the locations were off the west coast of Africa! After battling back and forth for a while and triple-checking the values directly in Bing Maps and in Google Maps, I discovered that the data types of the latitude and longitude fields had to be Decimal Number. Look at the before and after results:
I would have quickly discovered the need for the Decimal Number data type if I had tried to categorize the fields as Latitude and Longitude (respectively) in PowerPivot, as it gives this warning:
The categorization didn’t initially occur to me because I was already telling the map chart exactly which fields to use for Latitude and Longitude; I thought that was enough indication.
The Situation in the Power BI Desktop
In the Power BI Desktop the data type expectation is immediately obvious because if the fields are text, it will force a COUNT aggregation:
This can be noticed as soon as the field is dragged to the box, but we cannot get rid of the COUNT because it lacks the familiar “Do not summarize” option that we have in Power View. Moreover, categorizing the field (as shown below) does not resolve the issue either:
Apparently the category is treated as secondary to the data type. Therefore, one must type latitude and longitude fields as Decimal Number.
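If the coordinates arrive as text from the source, it can help to convert and sanity-check them before they reach the model. The Python sketch below is only an illustration of that idea, not something Power BI requires; the `to_decimal_coordinates` function is hypothetical, and the sample values are approximate coordinates for San Juan, Puerto Rico.

```python
def to_decimal_coordinates(lat_text, lon_text):
    """Convert text latitude/longitude to floats and check that they are in range."""
    lat = float(lat_text.strip())
    lon = float(lon_text.strip())
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return lat, lon


# Approximate coordinates for San Juan, Puerto Rico, stored as text.
san_juan = to_decimal_coordinates(" 18.4655", "-66.1057 ")
print(san_juan)
```

A value that cannot be parsed, or that falls outside the valid ranges, raises a `ValueError` instead of silently ending up in the wrong place on the map.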
Whenever working with anything geographical, maps are a cool visualization. One of the new visualization types in Power BI is the filled map, in which regions may be colored rather than just pinpointed. For example:
And like regular maps, filled maps may work as filter sources as well as filter targets. In this next image, the bar chart acts as a filter on the table and the filled map:
And in the following illustration, the filled map acts as a filter on the table and the bar chart:
And now the tip. If you copy the filled map, you can get one copy to work as a filter on the other and thereby create simultaneous high-level and low-level views of the selected area:
This of course works with a filled map filtering a regular map and vice versa.
You can expand this to another level of geographical detail, as in this image with US counties, in which the state of Tennessee is selected in the top map:
In a future post I will work through this last example step by step. It uses diverging coloring with custom minimum and maximum population values to color the top map, and ranking in DAX to display the top 10 most populated counties.
Get a sample Power BI Desktop file here.
After using the brand-spanking-new Power BI Desktop for a couple of days, these are my favorite features so far, especially as compared to Power View and PowerPivot:
- Ease of navigation between data model and visualizations. No more switching between windows.
- One-click publishing to powerbi.com. It couldn’t be easier.
- The new visualization types and the potential market for third-party visualizations.
- The long (and growing) list of data sources.
- More control over chart customization (e.g., colors, titles, etc.). For many people the lack of it was a reason for looking at other products, and a letdown when compared with Excel charts.
- Seamless integration between data, data transformations, and data model. Again, no more switching between windows.
- The unity of interface between the Power BI Desktop and the report editor in powerbi.com.
- The placing of measures in the fields list of a table. I could never get my measures to look right at the bottom of the grid and was always playing with the column widths. Now measures are first-class citizens.
- The elimination of the colon for defining measures. Perhaps a minor detail, but syntax simplification and uniformity are always welcome.
- The ability to connect to and build reports from tabular models without having to import data. Another huge reason to not look at other products.
- The promise of frequent updates and improvements. This should keep the excitement going.
- The fact that the community’s input is actively being sought and paid attention to. It’s great when you submit a smile or a frown and you get a response for more information.
What’s still in Power View and PowerPivot that I wish to see soon in the Power BI Desktop, hopefully in an improved fashion:
- The play axis for scatter charts.
- More control over fonts.
- Hierarchies in the data model.
- Column filters in the data view.
- Default aggregations for fields.
- Synonyms for Q&A.