Music is ubiquitous

Music is one of the ways we humans express ourselves. Not only can music help form your identity, it also helps people connect in this globalizing world. A shared interest in a particular kind of music has allowed artists to travel across the globe.

With online streaming platforms, it has become easier than ever to listen to music by artists from different countries. Let’s analyse a dataset from Spotify, an audio streaming platform, which is available on Kaggle.

The original dataset (with more than 3 million rows!) contained information about the top 200 artists in the daily charts of various countries for 2017. It has been modified to contain only the top 10 artists, which reduced the number of rows to 191848, still a big dataset to work with. Also, note that the top artists were selected based on the streams of their tracks, i.e. the number of times their songs were played by Spotify users.
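
The reduction described above can be sketched with pandas roughly as follows. This is a minimal illustration on invented data, not the code that was actually used to prepare the file. The `Date` column name, the placeholder artist names, and the stream counts are assumptions, and the sketch assumes the ranking is done per day and per region. For brevity it keeps the top 2 artists rather than the top 10.

```python
import pandas as pd

# Toy daily chart: one row per track (all values are invented for illustration)
chart = pd.DataFrame({
    'Date':    ['2017-01-01'] * 4 + ['2017-01-02'] * 4,
    'Region':  ['ca'] * 8,
    'Artist':  ['A', 'B', 'C', 'A', 'A', 'B', 'C', 'C'],
    'Streams': [100, 50, 30, 20, 10, 80, 40, 45],
})

# Add up the streams of all tracks by the same artist on each day in each region
daily = chart.groupby(['Date', 'Region', 'Artist'], as_index=False)['Streams'].sum()

# Keep only the TOP_N most-streamed artists per day and region
TOP_N = 2   # the notebook's dataset uses 10
top = (daily.sort_values('Streams', ascending=False)
            .groupby(['Date', 'Region'])
            .head(TOP_N)
            .sort_values(['Date', 'Region', 'Streams'], ascending=[True, True, False])
            .reset_index(drop=True))

print(top)
```

Sorting by streams before calling `head` within each group is one simple way to take a per-group top N; other approaches (for example ranking within groups) would work as well.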

Popularity of an artist across the globe

Let’s check how many times tracks from a user-selected artist were streamed on Spotify.

Run the code cells below and check what our dataset contains.

# You can comment out the following line if the module is already installed
!pip install wikipedia

# Import Python libraries
import pandas as pd
import calendar
import plotly.express as px
import plotly.graph_objects as go
from ipywidgets import interact, fixed, widgets, Layout, Button, Box, HBox, VBox
import wikipedia
from IPython.display import clear_output, display
import random

# Don't show warnings in output
import warnings
warnings.filterwarnings('ignore')
clear_output()

print('Libraries successfully imported.')

# Import the dataset and remove rows with missing entries
df = pd.read_csv('./Data/top_10_artists_spotify_2017_final.csv').dropna()

# Import country codes (required for the choropleth map)
country_codes = pd.read_csv('./Data/country_codes.csv',sep='\t')
country_codes['2let'] = country_codes['2let'].str.lower()   # Convert country codes to lower case

# Display the top 5 rows
df.head()

# Define a callback function for "Show Popularity" button
def show_popularity(ev):
    clear_output(wait=True)
    
    # Define display order for the buttons and menus
    display(Box(children = [artist_menu], layout = box_layout))
    display(Box(children = [show_button], layout = Layout(display= 'flex', flex_flow= 'row', align_items= 'center', width='100%', justify_content = 'center')))

    # Find total streams for a user-selected artist across countries
    subset = df[df['Artist'] == artist_menu.value].groupby('Region')
    total_streams = subset['Streams'].sum().to_frame('Streams').reset_index()
    
    # Merge the 3-letter country codes (required by plotly express) with the data
    final = total_streams.merge(country_codes, left_on='Region', right_on='2let', how='inner')\
            .drop(columns='2let')\
            .rename(columns={'3let':'Country Code'})
    
    # Find the wikipedia page for the artist 
    try:
        p = wikipedia.page(artist_menu.value)
    # If can't find the exact page, get the closest one    
    except wikipedia.exceptions.DisambiguationError as e:
        s = e.options[0]
        p = wikipedia.page(s)
    
    # Plot the choropleth map for the user-selected artist
    fig = px.choropleth(final,   # dataframe with required data 
                    locations="Country Code",   # Column containing country codes
                    color="Streams",   # Color of country should be based on number of streams
                    hover_name="Countrylet", # title to add to hover information
                    hover_data=["Streams"],   # data to add to hover information
                    projection='natural earth',   # preferred view of choropleth map
                    
                    # Add title for the map (hyperlinks for wiki page of artist and dataset are added)    
                    title = 'Popularity of <a href="{}">{}</a> by streams of tracks in daily Top 10 chart (2017)<br>Source: \
<a href="https://www.kaggle.com/edumucelli/spotifys-worldwide-daily-song-ranking/version/3">\
Spotify through Kaggle Datasets</a>'.format(p.url,artist_menu.value))
    fig.update_layout(geo=dict(showcountries=True))
    # Show the figure
    fig.show()
print('Defined the show_popularity function.')
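
To see what the aggregation and merge inside `show_popularity` produce, here is a minimal sketch of the same two steps on invented data. The column names (`Artist`, `Region`, `Streams`, `2let`, `3let`, `Countrylet`) follow the notebook; the artist names, stream counts, and the three country rows are made up for illustration.

```python
import pandas as pd

# Toy streaming data (invented values)
streams = pd.DataFrame({
    'Artist':  ['Artist A', 'Artist A', 'Artist A', 'Artist B'],
    'Region':  ['ca', 'ca', 'us', 'es'],
    'Streams': [10, 15, 40, 30],
})

# Toy country-code lookup with the same column names as country_codes.csv
codes = pd.DataFrame({
    '2let': ['ca', 'us', 'es'],
    '3let': ['CAN', 'USA', 'ESP'],
    'Countrylet': ['Canada', 'United States', 'Spain'],
})

# Step 1: total streams per region for one artist
totals = (streams[streams['Artist'] == 'Artist A']
          .groupby('Region')['Streams'].sum()
          .to_frame('Streams')
          .reset_index())

# Step 2: attach the 3-letter code that the choropleth map needs
final = (totals.merge(codes, left_on='Region', right_on='2let', how='inner')
               .drop(columns='2let')
               .rename(columns={'3let': 'Country Code'}))

print(final)
```

Because the merge uses `how='inner'`, any region without a matching 2-letter code is dropped from the result and will not be coloured on the map.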

Run the code cell below and select the artist whose popularity you want to analyze. Don’t forget to click the Show Popularity button.

# Layout for widgets
box_layout = Layout(display='flex', flex_flow='row', align_items='center', width='100%', justify_content = 'center')
style = {'description_width': 'initial'}

# Create dropdown menu for Artist
artist_menu = widgets.Dropdown(options = df['Artist'].sort_values().unique(), description ='Artist: ', style = style, disabled=False)

# Create Show Popularity button and define click events
show_button = widgets.Button(button_style= 'info', description="Show Popularity")
show_button.on_click(show_popularity)

# Define display order for the buttons and menus
display(Box(children = [artist_menu], layout = box_layout))
display(Box(children = [show_button], layout = Layout(display= 'flex', flex_flow= 'row', align_items= 'center', width='100%', justify_content = 'center')))

Try it for various artists. If you don’t know much about an artist, click the link in the title of the map, which will take you to the artist’s Wikipedia page. Hover your mouse over different countries to see how many times the artist’s tracks were streamed.

Questions:

  1. Which artists were popular in North America, Europe, and the Asia-Pacific region?

  2. Share your thoughts on how music (and media in general) has brought people together and become a globalizing force.