[Micro task] Explore NYC taxi trips dataset #6
Comments
Hi, I was approved for the initial stage of Outreachy. I noticed the datasets are in Parquet format. I need some clarity and guidance: can Bokeh read the Parquet files directly? I was able to read them using pandas. @pavithraes |
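On the question above: Bokeh has no Parquet reader of its own, so the usual route is to load the file into a pandas DataFrame and hand that to Bokeh. The sketch below assumes a tiny made-up frame standing in for the loaded data; the column names `trip_distance` and `fare_amount` follow the yellow taxi data dictionary, and the values are invented for illustration.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Stand-in for a frame loaded with pd.read_parquet(...); values are made up.
df = pd.DataFrame({
    "trip_distance": [1.2, 3.4, 0.8, 5.1],
    "fare_amount": [7.5, 15.0, 6.0, 21.5],
})

# Bokeh consumes the DataFrame, not the Parquet file itself.
source = ColumnDataSource(df)

p = figure(
    title="Fare vs. trip distance",
    x_axis_label="Trip distance (miles)",
    y_axis_label="Fare (USD)",
)
# Columns are referenced by name through the data source.
p.scatter(x="trip_distance", y="fare_amount", source=source)
```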
Hello @pavithraes |
Hello @pavithraes, I am exploring this data for the project Create a Blog Post Series - "Fundamentals of Data Visualization in Bokeh". I will use some Python libraries to summarize and analyze the data, and after data wrangling and processing I will visualize it as the project requires. |
Hi @robinokwanma, open the link to the dataset website, scroll down, and you will see a hyperlink, "Working with PARQUET format", right under the "Data Dictionary and MetaData" subtitle. It explains how to work with the format, and the full details are in the "trip record user guide". |
Thank you @AnishereMariam . I'm taking a look now |
Hi @pavithraes @AnishereMariam That's the link to my GitHub gist on this microtask. Please review |
@pavithraes started with this https://gist.github.com/Soot3/9eaf170fa2048e373e05046222350f54 |
@AnishereMariam thank you for answering the question @robinokwanma, I was about to ask the same question. |
@pavithraes I realized that the dataset is published on a monthly basis. Can someone download more than one month's dataset for the exploration? |
@robinokwanma, I opened the file and noticed that although the code seems fine, some variables are wrongly placed. Please fix that. @oluwaseun-tech, you are most welcome. Each month has over a million rows of data. If you are sure you can handle multiple months, that's great, but I suggest you use a subset of the data. That is just my opinion. |
Thanks, I have made the changes. |
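A minimal sketch of the "use a subset" suggestion, assuming the month has already been loaded into a DataFrame. The frame here is synthetic (random values, hypothetical size) so the snippet runs on its own; `DataFrame.sample` with a fixed `random_state` keeps the subset reproducible.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one month of trips; real months have over a million rows.
rng = np.random.default_rng(0)
n_rows = 100_000
trips = pd.DataFrame({
    "trip_distance": rng.uniform(0.1, 20.0, n_rows),
    "passenger_count": rng.integers(1, 7, n_rows),
})

# Draw a reproducible random subset that is easier to plot and share.
subset = trips.sample(n=10_000, random_state=42)
```

A random sample keeps the overall distribution roughly intact, whereas taking only the first rows with `head()` would favor the start of the month.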
Oh! Okay thank you |
@pavithraes please take a look at what I have done so far https://gist.github.com/oluwaseun-tech/ef413dd9658b2123bfc7240652bae90b |
Hello @pavithraes @AnishereMariam https://github.com/BhaswatiRoy/Data-Analysis-Projects/tree/main/Bokeh_Plots |
I had this similar problem. You can use pd.read_parquet to load the dataset. |
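For anyone following along, loading with `pd.read_parquet` might look like the sketch below. The helper name `load_trips` and the file name in the comment are hypothetical; the column names come from the yellow taxi data dictionary. Reading Parquet with pandas also needs a Parquet engine such as `pyarrow` or `fastparquet` installed.

```python
import pandas as pd

# Columns from the yellow taxi data dictionary; reading only these saves memory.
COLUMNS = ("tpep_pickup_datetime", "trip_distance", "fare_amount")


def load_trips(path, columns=COLUMNS):
    """Read selected columns of a trip-record Parquet file into a DataFrame."""
    return pd.read_parquet(path, columns=list(columns))


# Example call (hypothetical local file name, downloaded beforehand):
# trips = load_trips("yellow_tripdata_2022-01.parquet")
```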
Hi @BhaswatiRoy, your choice of visualizations is really cool. |
Hi @JoyclynUjunwaOgbonna, have you been able to solve that via the solutions I suggested earlier? |
thanks @AnishereMariam for the feedback, I am on my way to adding more visualizations! |
That is perfect @BhaswatiRoy |
@pavithraes, the link to my work on NYC Data Exploration on GitHub gist is below: |
Hello @pavithraes, @bryevdv, everyone. |
@Faith-Nchifor the link to your notebook is showing a 404 error: "I can't find this page". This usually happens when your Kaggle notebook is set to private. Could you check whether it is? If so, you might want to make it public so people can access it. |
I'm sorry about that @JoyclynUjunwaOgbonna . It's now public |
@Faith-Nchifor, I think it is the method you used. The chart followed the irregular fitting of the index. Would you consider using the |
@AnishereMariam your method is good. I realized that my map behaved the way it did because I never sorted the data. It looks just like this one now. Thanks for your input |
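The sorting point is worth spelling out: a line glyph connects points in the order they appear in the data, so unsorted x values produce a zigzag. The sketch below uses a small made-up daily series (the column names `pickup_date` and `trips` are illustrative, not from the notebook discussed here) and sorts it before plotting.

```python
import pandas as pd
from bokeh.plotting import figure

# Made-up daily trip counts, deliberately out of date order.
daily = pd.DataFrame({
    "pickup_date": pd.to_datetime(
        ["2022-01-03", "2022-01-01", "2022-01-04", "2022-01-02"]
    ),
    "trips": [120, 95, 140, 110],
})

# Sort by the x-axis column so the line moves left to right.
daily_sorted = daily.sort_values("pickup_date")

p = figure(x_axis_type="datetime", title="Trips per day")
p.line(x=daily_sorted["pickup_date"], y=daily_sorted["trips"])
```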
Hello @bryevdv, @pavithraes |
Hi @Faith-Nchifor well done |
Hi @BhaswatiRoy nice analysis, and your choice of visualization is really great |
If you are having any challenges regarding the project, ask on this channel. I will be glad to assist anyone |
thanks @Ajoke23 for the reviews |
Hello, @pavithraes @AnishereMariam please take a look at my first assignment on the analysis of NYC Taxi data on Jupyter Notebook. |
Hello everyone, my name is Anushka Sharma and I have made my contribution to the #1 project. I tried working with two different datasets: the first is TLC Driver 24 hour course and the second is the yellow taxi dataset for October and November. For reference, I have also attached a PDF containing my outputs and other relevant data. I am contributing to a project for the first time, and I appreciate any reviews and comments on it. |
#6 (comment) |
Hi, here is my submission for the microtask on the project, Create a blog post series: "Fundamentals of Data Visualization in Bokeh." Attached in a separate images folder are the plots that were generated inline. For some reason, they do not appear inline in the notebook here on GitHub. |
To show plot: |
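The snippet that originally followed is not preserved here, so the following is only a generic sketch of displaying a Bokeh plot inline in Jupyter, not necessarily what was posted. The helper names `make_demo_plot` and `show_inline` are hypothetical; `output_notebook` and `show` are the standard `bokeh.io` calls.

```python
from bokeh.plotting import figure


def make_demo_plot():
    """Build a small line plot with made-up data."""
    p = figure(title="Demo line plot")
    p.line([1, 2, 3, 4], [4, 6, 5, 8])
    return p


def show_inline(plot):
    """Call from a Jupyter notebook cell to render the plot below that cell."""
    from bokeh.io import output_notebook, show

    output_notebook()  # load BokehJS into the notebook session
    show(plot)         # display the plot in the cell output


demo = make_demo_plot()
# In a notebook cell: show_inline(demo)
```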
@Isaakkamau, that is fine work. Keep up the good work. |
@AnishereMariam thanks a lot, but how many visualizations are we supposed to have? I decided first to do one then I can add others if it's needed |
Hello @pavithraes @AnishereMariam Please find my contribution for task 1 here. I added the visualizations as comments since GitHub gist can't render them from my notebook. |
Well done @Azaya89, you did great work |
You did great work. Well done @PatChizzy. Unique and creative visualization |
Hi all, thanks for the submissions so far! This is our first time doing Outreachy, so this is a learning experience for us as well! One thing that has become apparent is that it is a bit confusing and difficult to provide individualized comments when all the submissions are mixed together in one place like this! I'd like to ask everyone who has submitted here to open a new issue that has any relevant links, images, etc. for your work. This will allow us to have 1-1 conversations with everyone on their own issue :) |
For those who might be having issues figuring it out, you can follow these steps. That's all. |
I think the issue here is not the code written. What I've been able to figure out is that using I hope this helps. |
This works quite alright in my Python environment. However, when the notebook has been downloaded, the images do not show |
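One possible workaround for plots that vanish outside the live notebook, offered as a sketch and not necessarily the approach suggested above, is to write each plot to a standalone HTML file and share that alongside the notebook. The output file name is arbitrary, and the temp-directory location is only used to keep the example self-contained.

```python
import tempfile
from pathlib import Path

from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

# Small demo plot with made-up data.
p = figure(title="Demo plot saved as HTML")
p.line([1, 2, 3, 4], [4, 6, 5, 8])

# Render the plot to a complete HTML document; BokehJS is loaded from the CDN.
html = file_html(p, resources=CDN, title="Demo plot")

# Write the page to disk (arbitrary location chosen for this example).
out_path = Path(tempfile.gettempdir()) / "demo_plot.html"
out_path.write_text(html, encoding="utf-8")
```

The resulting file opens in any browser with the plot still interactive, which a static notebook preview cannot offer.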
Okay @Azaya89. I'm gonna try it out. Thanks |
You're welcome. |
The New York City TLC taxi trips records data is frequently used for creating examples and tutorials for Python data science workflows. You can access the dataset through any of the following ways:
Note that the actual dataset is quite large, so please use a subset of the data or consider reducing it.
To complete this micro-task, download and explore a subset of the dataset with Bokeh plots. You can share your Jupyter Notebooks with us as a GitHub gist. As per Bryan's comment here, please open separate issues/PRs with your work, so that we can share feedback individually.
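As a starting point, an exploration plot could look like the sketch below, which counts pickups per hour of day and draws them as bars. The timestamps are randomly generated stand-ins for the `tpep_pickup_datetime` column (the name is taken from the yellow taxi data dictionary), so the shape of the chart is not meaningful.

```python
import numpy as np
import pandas as pd
from bokeh.plotting import figure

# Synthetic pickup timestamps spread over January 2022 (stand-in for real data).
rng = np.random.default_rng(1)
offsets = pd.to_timedelta(rng.integers(0, 31 * 24 * 3600, 5000), unit="s")
pickups = pd.Series(pd.Timestamp("2022-01-01") + offsets,
                    name="tpep_pickup_datetime")

# Count trips per hour of day, keeping hours with zero trips.
hourly = pickups.dt.hour.value_counts().reindex(range(24), fill_value=0)

p = figure(
    title="Pickups by hour of day",
    x_axis_label="Hour of day",
    y_axis_label="Number of trips",
)
p.vbar(x=hourly.index.tolist(), top=hourly.tolist(), width=0.8)
```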