Introducing Data Wrangling with Polars #26
Conversation
Just explored the .py file via the open-notebooks feature and it looks great! I was going to suggest adding polars to the inline dependencies at the start of the notebook, as described in our README instructions, but that isn't required because the file doesn't actually import the library. Thanks for kicking off the Polars series!
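For reference, marimo notebooks declare inline dependencies with a PEP 723 script metadata block at the top of the file. A minimal sketch, assuming the notebook did import Polars; the Python version bound and the exact package list are illustrative, not taken from this PR:

```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "marimo",
#     "polars",
# ]
# ///

# The notebook's own code follows the metadata block. Running the file with
# marimo's --sandbox flag installs the listed packages into an isolated
# environment before the notebook starts.
```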
@Haleshot Unfortunately, I am not able to view my notebook via open-notebooks. I am trying the following URL: https://marimo.app/github.com/koushikkhan/learn/blob/feat/issue%2318/polars-data-wrangling/polars/01_why_polars.py. I must be missing something here.
Thanks so much for the PR! This is a great start. The content is good, and makes me excited for the rest of the notebooks.
I have left some comments and suggestions. High level feedback:
- Let's aim to be concise, and avoid repeating ourselves often.
- Let's get to code examples sooner rather than later.
- Notebooks are meant to be presented in edit mode, so there is no need to duplicate the code in the markdown.
- Let's make sure the basics are explained explicitly, such as giving an example of a DataFrame early on.
- Let's tone the language down a little bit — the style should be educational and informative.
Again, great start and thanks so much!
df_pd = pd.DataFrame(
    {
        "Gender": ["Male", "Female", "Male", "Female", "Male", "Female",
nit: for consistency throughout future notebooks, let's use lowercase keys: "gender", "height_cm".
Alright, I will use lowercase column names from here on.
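The lowercase naming convention discussed here can also be applied mechanically before a DataFrame is built. A minimal sketch in plain Python; the helper `to_snake_case` and the raw labels are hypothetical and are not part of the notebook:

```python
import re


def to_snake_case(label: str) -> str:
    """Lowercase a column label and join its words with underscores."""
    # Collapse every run of non-alphanumeric characters into one underscore.
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", label.strip())
    return cleaned.strip("_").lower()


# Hypothetical raw labels, as they might arrive from an external source.
raw_columns = {
    "Gender": ["Male", "Female"],
    "Age": [13, 15],
    "Height CM": [150.0, 170.0],
}

# Normalize the keys once; the resulting dict can be passed to a DataFrame constructor.
columns = {to_snake_case(label): values for label, values in raw_columns.items()}
```

Normalizing the keys in one place keeps later notebooks consistent without editing each dictionary by hand.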
@akshayka I will work on the requested changes.
Co-authored-by: Akshay Agrawal <[email protected]>
…braries Co-authored-by: Akshay Agrawal <[email protected]>
Like Pandas and PySpark, the central data structure in Polars is **the DataFrame**, a tabular data structure consisting of named columns. For example, the next cell constructs a DataFrame that records the gender, age, and height in centimeters for a number of individuals.

<INSERT CODE CELL>
The suggestion was to split this markdown into two cells, and insert a block of Python in between that creates a DataFrame (for example, the gender, age, and height dataframe) which is reused in subsequent cells. Perhaps
import polars as pl

df_pl = pl.DataFrame(
    {
        "gender": ["Male", "Female", "Male", "Female", "Male", "Female",
                   "Male", "Female", "Male", "Female"],
        "age": [13, 15, 17, 19, 21, 23, 25, 27, 29, 31],
        "height_cm": [150.0, 170.0, 146.5, 142.0, 155.0, 165.0, 170.8, 130.0, 132.5, 162.0],
    }
)
df_pl
@koushikkhan would you like any help finishing this PR?
📝 Summary
This PR introduces a course on data wrangling with Polars using marimo notebooks. It is linked to issue #18.
📋 Checklist
--sandbox
README.md