Python Introduction to pandas Exploring pandas Optional Challenge #3 - Verified Email List

Sebastiaan van Vugt
Python Development Techdegree Graduate 13,554 Points

A value is trying to be set on a copy of a slice from a DataFrame

I get the warning "A value is trying to be set on a copy of a slice from a DataFrame" in two different places in this code:

# Setup
import os

import pandas as pd

from utils import make_chaos

from tests.helpers import check

pd.options.display.max_rows = 10
users = pd.read_csv(os.path.join('data', 'users.csv'), index_col=0)
# Pay no attention to the person behind the curtain
make_chaos(users, 19, ['first_name'], lambda val: val.lower())



## CHALLENGE - Verified email list ##

# TODO: Narrow list to those that have email verified.
users2 = users[(users.email_verified == True)]

#  The only columns should be first, last and email
email_list = users[:]

users3 = users2[['first_name','last_name','email']]


# TODO: Remove any rows missing last names

users3.dropna(subset = ["last_name"], inplace=True)


users4 = users3.dropna(subset = ["last_name"])



# # TODO: Ensure that the first names are the proper case
users4.loc[users4.first_name.str.islower(), 'first_name'] = users4.first_name.str.title()

# Return the new sorted DataFrame..last name then first name ascending
users4.sort_values(['last_name','first_name'])

I get this warning with

users3.dropna(subset = ["last_name"], inplace=True)

and

users4.loc[users4.first_name.str.islower(), 'first_name'] = users4.first_name.str.title()

In the first case I am using inplace=True, which I thought would remedy this issue. In the second case I closely followed the s2n08-manipulating-text.ipynb example. I would greatly appreciate it if someone could explain what I am doing wrong.

2 Answers

Malte Niepel
19,212 Points

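You are trying to set new values on users3, which was sliced out of users2. pandas cannot be sure whether users3 is a view or a copy of users2, so it warns when you modify it. inplace=True does not help, because it still modifies users3 itself. What fixed the issue was to create an explicit copy of the DataFrame: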

users3 = users2[['first_name','last_name','email']].copy()
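A minimal sketch of that fix on made-up data, in case it helps to see it in isolation. The small users frame below is a hypothetical stand-in for users.csv, and email_list is only an illustrative name. The point is that after .copy() the result is an independent DataFrame, so changing it leaves users untouched.

```python
import pandas as pd

# Hypothetical stand-in for the course's users.csv (not the real data)
users = pd.DataFrame(
    {
        "first_name": ["ada", "Grace", "linus"],
        "last_name": ["Lovelace", None, "Torvalds"],
        "email": ["ada@example.com", "grace@example.com", "linus@example.com"],
        "email_verified": [True, True, False],
    }
)

# Keep only verified users, then take an explicit copy of the three columns
verified = users[users["email_verified"]]
email_list = verified[["first_name", "last_name", "email"]].copy()

# Both modifications now act on the independent copy
email_list.dropna(subset=["last_name"], inplace=True)
lower = email_list["first_name"].str.islower()
email_list.loc[lower, "first_name"] = email_list.loc[lower, "first_name"].str.title()

print(email_list)
```

Only the row for "ada" survives here (Grace has no last name, linus is not verified), and the original users frame still holds the lowercase name.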

This is my code. It took me a while to figure out why yours raised the warning. In my version I never modify the column subset: it is only created at the end, to be sorted before I display it.

## CHALLENGE - Verified email list ##

# TODO: Narrow list to those that have email verified.
#  The only columns should be first, last and email
email_list = users[:]
email_list = email_list[email_list.email_verified == True]

# TODO: Remove any rows missing last names
email_list.dropna(how='any', inplace=True)

# TODO: Ensure that the first names are the proper case
email_list.loc[email_list.first_name.str.islower(), 'first_name'] = email_list.first_name.str.title()
len(email_list[email_list.first_name.str.islower()])

# Return the new sorted DataFrame..last name then first name ascending
email_list2 = email_list[['first_name', 'last_name', 'email']]
email_list2.sort_values(by=['last_name', 'first_name'])
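One thing to keep in mind with dropna(how='any'): it drops a row when any column has a missing value, not only last_name. If the data only has gaps in last_name the result is the same, but otherwise subset=['last_name'] is the narrower choice. A small sketch on made-up data (the df frame and its values are hypothetical, not from users.csv):

```python
import pandas as pd

# Hypothetical data: one row lacks a last name, another lacks a first name
df = pd.DataFrame(
    {
        "first_name": ["Ada", "Grace", None],
        "last_name": ["Lovelace", None, "Turing"],
        "email": ["ada@example.com", "grace@example.com", "alan@example.com"],
    }
)

# Drops every row that has a missing value in any column
drop_any = df.dropna(how="any")

# Drops only the rows where last_name is missing
drop_last = df.dropna(subset=["last_name"])

print(len(drop_any), len(drop_last))
```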
## CHALLENGE - Verified email list ##

# TODO: Narrow list to those that have email verified.
verified_emails = users["email_verified"] == True
#  The only columns should be first, last and email
email_list = users.loc[verified_emails, ["first_name","last_name","email"]]

# TODO: Remove any rows missing last names
email_list.dropna(inplace=True)
# TODO: Ensure that the first names are the proper case
email_list.loc[email_list.first_name.str.islower(), "first_name"] = email_list.first_name.str.title()
email_list.loc[email_list.last_name.str.islower(), "last_name"] = email_list.last_name.str.title()

# Return the new sorted DataFrame..last name then first name ascending
email_list.sort_values(["last_name", "first_name"], inplace=True)

# I used email_list.dropna() without a subset because the only missing values were last names
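Another way to sidestep the warning, sketched under the assumption that nothing needs to be modified in place: build the result in one chain, where every step returns a new DataFrame. The users frame here is made-up sample data, not the course file. Series.mask swaps in the title-cased name only where the original is all lowercase, which mirrors the .loc assignment used above.

```python
import pandas as pd

# Hypothetical stand-in for the course's users.csv (not the real data)
users = pd.DataFrame(
    {
        "first_name": ["alan", "Grace", "linus", "ada"],
        "last_name": ["Turing", None, "Torvalds", "Lovelace"],
        "email": [
            "alan@example.com",
            "grace@example.com",
            "linus@example.com",
            "ada@example.com",
        ],
        "email_verified": [True, True, False, True],
    }
)

email_list = (
    # Select verified rows and the three wanted columns in a single .loc call
    users.loc[users["email_verified"], ["first_name", "last_name", "email"]]
    # Remove rows without a last name
    .dropna(subset=["last_name"])
    # Title-case only the first names that are entirely lowercase
    .assign(
        first_name=lambda df: df["first_name"].mask(
            df["first_name"].str.islower(), df["first_name"].str.title()
        )
    )
    # Sort by last name, then first name, ascending
    .sort_values(["last_name", "first_name"])
)

print(email_list)
```

Because no step writes into an existing frame, users stays exactly as it was loaded.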