-
Notifications
You must be signed in to change notification settings - Fork 0
/
Textual_Content.txt
113 lines (91 loc) · 6.03 KB
/
Textual_Content.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
PHISHr
Phish the Phisher before they phish you!!!
Phishing attacks are a common type of cyber attack where malicious actors attempt to deceive individuals or organizations into revealing sensitive information such as usernames, passwords, credit card numbers, or other personal or financial data. These attacks typically involve impersonating a trusted entity, such as a bank, a government agency, a company, or even a colleague or friend.
Phishing attacks pose significant risks to individuals, businesses, and organizations. They can lead to identity theft, financial loss, data breaches, and reputational damage. To protect against phishing attacks, it's essential to stay vigilant, be cautious of unsolicited emails or messages, verify the authenticity of websites and communications, and regularly update security measures such as antivirus software and firewalls. Additionally, education and awareness training for employees and users are crucial in preventing successful phishing attacks.
How you can save yourself from these types of phishing attacks?
Awareness and participation insecurity is the only trick you can use to save yourself to get caught in these attack. whenever you receive any Emails, SMS or anything unidentified don’t reply to them, don’t pick any calls from unknown numbers, don’t provide your personal data like your account details, credit card details, any mail id, passwords, or any information which is useful to you. Awareness of these types of phishing attacks and participation in the security of the company or in an organization should be mandatory. This is the only way that you can save yourself.
import streamlit as st
# Title and Description
st.title("End-to-End Machine Learning Project: Phishing Website Detection")
st.write("This is an End-to-End Machine Learning Project which focuses on phishing websites to classify phishing and legitimate ones. Particularly, We focused on content-based features like html tag based features. You can find feature extraction, data collection, preparation process here. Also, building ML models, evaluating them are available here.")
# Inputs
st.markdown("<hr>", unsafe_allow_html=True)
st.subheader("Inputs")
st.write("CSV files of phishing and legitimate URLs:")
st.write("- verified_online.csv: phishing websites URLs from phishtank.org")
st.write("- tranco_list.csv: legitimate websites URLs from tranco-list.eu")
# General Flow
st.markdown("<hr>", unsafe_allow_html=True)
st.subheader("General Flow")
st.write("1. Use csv file to get URLs")
st.write("2. Send a request to each URL and receive a response by requests library of python")
st.write("3. Use the content of response and parse it by BeautifulSoup module")
st.write("4. Extract features and create a vector which contains numerical values for each feature")
st.write("5. Repeat feature extraction process for all content\websites and create a structured dataframe")
st.write("6. Add label at the end to the dataframes | 1 for phishing 0 for legitimate")
st.write("7. Save the dataframe as csv and structured_data files are ready!")
st.write("8. Check 'structured_data_legitimate.csv' and 'structured_data_phishing.csv' files.")
st.write("9. After obtaining structured data, you can use combine them and use them as train and test data")
st.write("10. You can split data as train and test like in the machine_learning.py first part, or you can implement K-fold cross-validation like in the second part of the same file. We implemented K-fold as K=5.")
st.write("11. Then We implemented five different ML models:")
st.write(" - Support Vector Machine")
st.write(" - Gaussian Naive Bayes")
st.write(" - Decision Tree")
st.write(" - Random Forest")
st.write(" - AdaBoost")
st.write("12. You can obtain the confusion matrix, and performance measures: accuracy, precision, recall")
st.write("13. Finally, We visualized the performance measures for all models.")
st.write("14. Naive Bayes is the best for my case.")
# Important Notes
st.markdown("<hr>", unsafe_allow_html=True)
st.subheader("Important Notes")
st.write("Features are content-based and need BeautifulSoup module's methods and fields etc So, you should install it.")
# Dataset
st.markdown("<hr>", unsafe_allow_html=True)
st.subheader("Dataset")
st.write("With your URL list, you can create your own dataset by using data_collector python file.")
def streamlit_menu(example=1):
if example == 1:
# 1. as sidebar menu
with st.sidebar:
selected = option_menu(
menu_title="Main Menu", # required
options=["Home", "Projects", "Contact"], # required
icons=["house", "book", "envelope"], # optional
menu_icon="cast", # optional
default_index=0, # optional
)
return selected
if example == 2:
# 2. horizontal menu w/o custom style
selected = option_menu(
menu_title=None, # required
options=["Home", "Projects", "Contact"], # required
icons=["house", "book", "envelope"], # optional
menu_icon="cast", # optional
default_index=0, # optional
orientation="horizontal",
)
return selected
if example == 3:
# 2. horizontal menu with custom style
selected = option_menu(
menu_title=None, # required
options=["Home", "Projects", "Contact"], # required
icons=["house", "book", "envelope"], # optional
menu_icon="cast", # optional
default_index=0, # optional
orientation="horizontal",
styles={
"container": {"padding": "0!important", "background-color": "#fafafa"},
"icon": {"color": "orange", "font-size": "25px"},
"nav-link": {
"font-size": "25px",
"text-align": "left",
"margin": "0px",
"--hover-color": "#eee",
},
"nav-link-selected": {"background-color": "green"},
},
)
return selected