Access analysis (tools) #43

Open
3 tasks
kikuomax opened this issue Oct 11, 2022 · 5 comments

@kikuomax
Member

Develop tools to analyze access logs on top of the data warehouse.

  • Within a specific time period,
    • Where did the traffic to my site come from?
    • How often did traffic arrive from a specific origin (referer)?
    • How much attention did a specific page on my site get?
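The questions above boil down to filtering rows by time period and counting by one column. A minimal Python sketch follows, assuming each access-log row carries a timestamp, a referer, and a page path. The row layout, the sample data, and the names `in_period` and `count_by` are hypothetical, since the warehouse schema is not shown in this issue.

```python
from collections import Counter
from datetime import datetime

# Hypothetical access-log rows; the real warehouse schema is not shown in this issue.
ACCESS_LOGS = [
    {"time": datetime(2022, 10, 1, 9, 0), "referer": "https://example.com/", "page": "/blog/a"},
    {"time": datetime(2022, 10, 1, 9, 5), "referer": None, "page": "/blog/a"},
    {"time": datetime(2022, 10, 2, 12, 0), "referer": "https://example.com/", "page": "/blog/b"},
    {"time": datetime(2022, 11, 1, 8, 0), "referer": "https://example.org/", "page": "/blog/a"},
]


def in_period(rows, start, end):
    """Return the rows whose timestamp falls in the half-open period [start, end)."""
    return (row for row in rows if start <= row["time"] < end)


def count_by(rows, key):
    """Count rows per value of `key`; a missing value is reported as '(none)'."""
    return Counter(row[key] or "(none)" for row in rows)


start, end = datetime(2022, 10, 1), datetime(2022, 11, 1)
by_referer = count_by(in_period(ACCESS_LOGS, start, end), "referer")
by_page = count_by(in_period(ACCESS_LOGS, start, end), "page")
```

`by_referer` answers where traffic came from and how often, and `by_page` gives the number of requests per page within the period. The same grouping could be expressed as a `GROUP BY` query in the data warehouse.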

Originally posted by @kikuomax in #30 (comment)

@kikuomax
Member Author

We should experiment with analysis queries before the free trial credits expire.

@kikuomax kikuomax self-assigned this Oct 19, 2022
@kikuomax
Member Author

kikuomax commented Dec 18, 2022

Findings from the preliminary analysis:

  • Most visitors arrive without a referer
  • A lot of internal traffic
  • A lot of suspicious visitors trying to exploit misconfigurations
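To separate these groups before counting, each request could be labeled first. The sketch below is only an illustration under stated assumptions: "internal" is taken to mean that the referer points at the site itself, `SITE_HOST` is a placeholder, and the probe paths in `SUSPICIOUS_PREFIXES` are made-up examples rather than patterns taken from the actual logs.

```python
from urllib.parse import urlparse

SITE_HOST = "example.com"  # assumption: placeholder for the site's own host
# assumption: illustrative probe paths only, not derived from the actual logs
SUSPICIOUS_PREFIXES = ("/.env", "/.git", "/wp-login.php")


def classify(referer, path):
    """Label one request as 'suspicious', 'no-referer', 'internal', or 'external'."""
    if path.startswith(SUSPICIOUS_PREFIXES):
        return "suspicious"
    if not referer:
        return "no-referer"
    if urlparse(referer).hostname == SITE_HOST:
        return "internal"
    return "external"
```

Checking the path first means a probe is labeled suspicious even when it carries no referer, so the groups do not overlap.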

@kikuomax
Member Author

Since running queries on Redshift is expensive, I am going to save the analysis results somewhere. Where should I store them?

  • In CSV files in an S3 bucket?
  • In a DynamoDB table?

@kikuomax
Member Author

DynamoDB may be preferable because it can export items to CSV files.

@kikuomax
Member Author

> DynamoDB may be preferable because it can export items to CSV files.

But we still have to transform the data to get the desired representations.
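As a rough illustration of such a transformation, the Python sketch below flattens items in DynamoDB's typed JSON form into CSV text. The attribute names `referer` and `count`, the sample items, and the helpers `flatten` and `to_csv` are hypothetical, and only the string ("S") and number ("N") attribute types are handled.

```python
import csv
import io

# Hypothetical items in DynamoDB's typed JSON form ("S" = string, "N" = number).
ITEMS = [
    {"referer": {"S": "https://example.com/"}, "count": {"N": "2"}},
    {"referer": {"S": "(none)"}, "count": {"N": "1"}},
]


def flatten(item):
    """Convert one typed item into a plain dict; only "S" and "N" are handled."""
    plain = {}
    for name, typed in item.items():
        # Each attribute is a single-entry mapping of type tag to value.
        (kind, value), = typed.items()
        if kind == "S":
            plain[name] = value
        elif kind == "N":
            # DynamoDB transmits numbers as strings.
            plain[name] = int(value) if value.lstrip("-").isdigit() else float(value)
        else:
            raise ValueError(f"unsupported attribute type: {kind}")
    return plain


def to_csv(items, columns):
    """Render items as CSV text with a header row, in the given column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(flatten(item) for item in items)
    return buffer.getvalue()


CSV_TEXT = to_csv(ITEMS, ["referer", "count"])
```

The column order is passed explicitly so that the CSV layout does not depend on the attribute order of individual items.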
