Misc.Get_Pomeroy_Ratings() returning incorrect names in the 'Team' field #41

@Tim-Romer

Hi Team,

I found an issue when pulling the Ratings table by calling the misc.get_pomeroy_ratings(browser) method: some of the team names are being cut off. It seems to affect team names with more than one whitespace (ex. Cal St. Northridge) and team names that contain characters other than letters or "." (ex. Saint Mary's, Texas A&M). Below are a few of the problematic team names I found:

| Correct Team Name | Team Name from data pull |
| --- | --- |
| Saint Mary's | Saint Mary |
| San Diego St. | San Diego |
| Sam Houston St. | Sam Houston |
| Texas A&M | Texas A |
| etc. | etc. |

I found this issue that has been closed and it seems to be related: #9

I believe the issue has to do with the regex in misc.py line 35:
tmp = ratings_df['Team'].str.extract(r'(?P<Team>[a-zA-Z.]+\s*[a-zA-Z]+\.*)\s*(?P<Seed>\d*)')
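
To make the truncation easy to reproduce, here is a minimal sketch that runs the pattern above against a few of the names from the table, using only the standard `re` module (`Series.str.extract` likewise takes the groups from the first match in each string). The `PROPOSED` pattern and the `split_team_seed` helper are hypothetical and not part of the library. The sample inputs assume the raw cell is a team name optionally followed by a seed number, and the seed values are made up for illustration. The proposed pattern also assumes that no team name ends in digits.

```python
import re

# Pattern quoted above from misc.py line 35.
CURRENT = re.compile(r'(?P<Team>[a-zA-Z.]+\s*[a-zA-Z]+\.*)\s*(?P<Seed>\d*)')

# Hypothetical replacement (an assumption, not the maintainers' fix):
# lazily take everything up to an optional trailing run of digits.
PROPOSED = re.compile(r'^\s*(?P<Team>.+?)\s*(?P<Seed>\d*)\s*$')


def split_team_seed(pattern, raw):
    """Return (team, seed) from the first match of `pattern` in `raw`."""
    match = pattern.search(raw)
    if match is None:
        return (None, None)
    return (match.group('Team'), match.group('Seed'))


# Illustrative inputs; the seed numbers are invented for this sketch.
samples = [
    "Saint Mary's 5",
    "San Diego St. 4",
    "Sam Houston St. 12",
    "Texas A&M",
    "Cal St. Northridge",
]

for raw in samples:
    old = split_team_seed(CURRENT, raw)
    new = split_team_seed(PROPOSED, raw)
    print(f"{raw!r}: current={old} proposed={new}")
```

The current pattern allows at most two "words" made of letters and periods, so it stops at the first apostrophe, ampersand, or third word. The sketch instead treats anything before the optional trailing digits as the team name. If it behaves as expected on the full table, the same pattern string could be passed to `.str.extract` in place of the existing one.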

Let me know what y'all think. Happy to provide a more complete list of team names that are not correct if needed.

Thanks!
