fix: Enhanced Data Collection and Response Time for SendWork and Twitter #471

Merged
merged 38 commits into from
Aug 2, 2024

jdutchak
Contributor

@jdutchak jdutchak commented Jul 31, 2024

Description

✓ - denotes this task complete

This PR fixes #: it runs tests for 429 and bad-auth errors and fixes the timing of work responses sent to the collector.

Key Product Enhancements

  • Improved Twitter scraping timeout and performance - ✓
  • Implemented error handling in tweet collection - ✓
  • Upgraded sentiment analysis to work with updated tweet data structures - ✓
  • Optimized worker selection for faster and more efficient data processing - ✓
  • Implemented reduced timeout handling to have faster response times - ✓

Performance Metrics

  • Worker Selection:

    • Changed from selecting all available nodes to selecting a maximum of XX nodes per request - ✓
    • Node selection now considers worker category (Twitter, Telegram, Discord, Web) for more targeted task distribution - ✓
  • Timeout Handling:

    • Reduced global timeout from 60 seconds to XX (TBD) seconds for faster response times - ✓
    • Implemented per-node timeout checks to identify and temporarily exclude underperforming nodes - ✓
  • Response Collection:

    • Now collecting responses from all workers within the timeout period, instead of stopping at the first response - ✓
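
The worker-selection bullets above could be sketched roughly as follows. This is a minimal illustration only, not the code from this PR: `WorkerCategory`, `Worker`, and `SelectWorkers` are hypothetical names, and the cap (the "XX" above, still undecided) is passed in as a parameter.

```go
package main

import "fmt"

// WorkerCategory is a hypothetical enum for the kinds of work a node supports.
type WorkerCategory int

const (
	CategoryTwitter WorkerCategory = iota
	CategoryTelegram
	CategoryDiscord
	CategoryWeb
)

// Worker is a hypothetical, simplified stand-in for a node record.
type Worker struct {
	PeerID     string
	Categories map[WorkerCategory]bool
}

// SelectWorkers returns at most maxWorkers workers that support the given
// category, preserving the order of the input slice.
func SelectWorkers(workers []Worker, category WorkerCategory, maxWorkers int) []Worker {
	if maxWorkers <= 0 {
		return nil
	}
	selected := make([]Worker, 0, maxWorkers)
	for _, w := range workers {
		if len(selected) >= maxWorkers {
			break
		}
		// A missing map key reads as false, so unsupported categories are skipped.
		if w.Categories[category] {
			selected = append(selected, w)
		}
	}
	return selected
}

func main() {
	workers := []Worker{
		{PeerID: "peerA", Categories: map[WorkerCategory]bool{CategoryTwitter: true}},
		{PeerID: "peerB", Categories: map[WorkerCategory]bool{CategoryWeb: true}},
		{PeerID: "peerC", Categories: map[WorkerCategory]bool{CategoryTwitter: true}},
		{PeerID: "peerD", Categories: map[WorkerCategory]bool{CategoryTwitter: true}},
	}
	for _, w := range SelectWorkers(workers, CategoryTwitter, 2) {
		fmt.Println(w.PeerID)
	}
}
```

Filtering before capping means the cap applies only to nodes that can actually do the work, so a Twitter request is never spent on a Web-only node.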

Breaking Changes

  • None

Next Steps @jdutchak to document here

  • Fine-tune worker selection algorithm based on performance data
  • Implement adaptive timeouts based on historical node performance

Notes for Reviewers

Signed commits

  • Yes, I signed my commits.

@jdutchak jdutchak self-assigned this Jul 31, 2024
@jdutchak jdutchak added the "DO NOT MERGE" (Something will break if you do) label Jul 31, 2024
@teslashibe teslashibe changed the title from "fix: tweet errors" to "fix: Enhanced Data Collection and Response Time for SendWork and Twitter" Aug 1, 2024
pkg/api/handlers_data.go Outdated
@teslashibe
Contributor

teslashibe commented Aug 1, 2024

@jdutchak I added some changes to the timeouts and have commented in my review. When I run with my local node set up as a Twitter scraper, the request only times out, even though my node is one of the n=3 nodes and should return a response.

Ideally, in this case it should at minimum return the response from my node, even if the other two nodes are unavailable to do the work. Or, in the other case, where my node cannot do the work, the other three nodes in nodeData should then return the work.

INFO[0017] [+] Sending work to network                  
INFO[0017] [+] Actor started                            
INFO[0017] [+] Worker Address: 5.180.148.225            
INFO[0017] [+] Worker Address: 194.60.201.42            
INFO[0024] [+] worker tick                              
[GIN] 2024/08/01 - 11:43:23 | 504 | 10.000647334s |             ::1 | GET      "/api/v1/data/twitter/profile/brendanplayford"
INFO[0039] [+] worker tick                              
INFO[0049] [+] Peer added to DHT: 16Uiu2HAm7bHeCLgb7AD9rEyuHGzWdMya8qkPzciw1LxBbGcXz26S 
INFO[0054] [+] worker tick                              
INFO[0068] [+] Routing table size: 32                   
INFO[0069] [+] blockchain tick                          
INFO[0069] [+] worker tick                              
INFO[0084] [+] worker tick                              
INFO[0099] [+] worker tick                              
INFO[0114] [+] worker tick                              
INFO[0128] [+] Routing table size: 32                   
INFO[0129] [+] worker tick                              
INFO[0144] [+] worker tick                              
INFO[0159] [+] worker tick                              
INFO[0174] [+] worker tick                              
INFO[0188] [+] Routing table size: 32                   
INFO[0189] [+] worker tick                              
INFO[0204] [+] worker tick                              
INFO[0219] [+] worker tick                              
INFO[0234] [+] worker tick                              
INFO[0248] [+] Routing table size: 32                   
INFO[0249] [+] worker tick                              
INFO[0258] [+] Connected                                 Peer=16Uiu2HAmV4jCKqTcUY2LHC3WT9kSBp7kAfmJ4vHZxB5En4A4whF5 conn="<swarm.Conn[*libp2pquic.transport] /ip4/0.0.0.0/udp/4001/quic-v1 (16Uiu2HAm9GNsYsuXkLGM8r7bzoh2nFxkocvuUi9oFeaMUqdVPTB4) <-> /ip4/89.58.14.43/udp/4001/quic-v1 (16Uiu2HAmV4jCKqTcUY2LHC3WT9kSBp7kAfmJ4vHZxB5En4A4whF5)>" network="<Swarm 16Uiu2HAm9GNsYsuXkLGM8r7bzoh2nFxkocvuUi9oFeaMUqdVPTB4>"
INFO[0264] [+] worker tick                              
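
The behaviour requested in this comment (return whatever succeeded within the timeout rather than a 504 when some nodes are unavailable) could look something like the sketch below. It is an assumption-laden illustration, not the SendWork implementation: `WorkFunc`, `Result`, `CollectResponses`, `ErrNoResponse`, and the demo workers are all hypothetical, and only the standard library `context` package is used.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// WorkFunc is a hypothetical stand-in for sending one request to one worker.
type WorkFunc func(ctx context.Context) (string, error)

// Result pairs a worker's peer ID with its response or error.
type Result struct {
	PeerID string
	Data   string
	Err    error
}

// ErrNoResponse is returned when no worker succeeded before the timeout.
var ErrNoResponse = errors.New("no worker responded before the timeout")

// CollectResponses sends the request to every worker concurrently and gathers
// all successful responses that arrive before the timeout expires.
func CollectResponses(workers map[string]WorkFunc, timeout time.Duration) ([]Result, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// Buffered so slow workers can still send after we stop listening.
	results := make(chan Result, len(workers))
	for id, work := range workers {
		go func(id string, work WorkFunc) {
			data, err := work(ctx)
			results <- Result{PeerID: id, Data: data, Err: err}
		}(id, work)
	}

	var succeeded []Result
collect:
	for i := 0; i < len(workers); i++ {
		select {
		case r := <-results:
			if r.Err == nil {
				succeeded = append(succeeded, r)
			}
		case <-ctx.Done():
			break collect // timeout: keep what we have so far
		}
	}
	if len(succeeded) == 0 {
		return nil, ErrNoResponse
	}
	return succeeded, nil
}

// demoTimeout and demoWorkers model one healthy local node, one remote node
// that never answers, and one remote node that fails immediately.
const demoTimeout = 100 * time.Millisecond

func demoWorkers() map[string]WorkFunc {
	return map[string]WorkFunc{
		"local": func(ctx context.Context) (string, error) { return "profile data", nil },
		"remote-1": func(ctx context.Context) (string, error) {
			<-ctx.Done() // simulates an unavailable node
			return "", ctx.Err()
		},
		"remote-2": func(ctx context.Context) (string, error) {
			return "", errors.New("bad auth")
		},
	}
}

func main() {
	res, err := CollectResponses(demoWorkers(), demoTimeout)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range res {
		fmt.Printf("%s: %s\n", r.PeerID, r.Data)
	}
}
```

With this shape, a timeout error is only surfaced when every selected node fails or stays silent; a single healthy node is enough to produce a response.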

@teslashibe
Contributor

teslashibe commented Aug 1, 2024

Hey @jdutchak, quick note here: when we catch errors from other nodes, I suggest we add the PeerID of the node the error is returned from and, if we can in this PR, differentiate between local and remote errors. Then a user knows whether an error relates to their local node or to another remote node on the network.

ERRO[0107] [-] Error processing request: there was an error authenticating with your Twitter credentials 
INFO[0107] [+] Actor stopping                           
INFO[0107] [+] Actor stopped                            
INFO[0107] [-] Peer removed from DHT: 16Uiu2HAmV4jCKqTcUY2LHC3WT9kSBp7kAfmJ4vHZxB5En4A4whF5 
INFO[0108] [+] Peer added to DHT: 16Uiu2HAkzF8cmqJDg3eeTYACydCL8UQLf9f86UNuYfVtv23PayXV 
  1. Case where a remote node has an error
This would become: ERRO[0107] [-] Remote node {PEERID}: Error processing request: there was an error authenticating with your Twitter credentials 
  2. Case where a local node has an error
This would become: ERRO[0107] [-] Your node {PEERID}: Error processing request: there was an error authenticating with your Twitter credentials. Please check your configuration. 
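
One way to produce the two message shapes proposed above is to wrap the error with its source peer. The following is a hedged sketch only: `PeerError`, `WrapPeerError`, and `errTwitterAuth` are hypothetical names that do not come from the codebase, and peer IDs are plain strings here for simplicity.

```go
package main

import (
	"errors"
	"fmt"
)

// PeerError is a hypothetical wrapper recording which node produced an error.
type PeerError struct {
	PeerID string
	Local  bool
	Err    error
}

// Error renders the local or remote message format suggested in the comment.
func (e *PeerError) Error() string {
	if e.Local {
		return fmt.Sprintf("Your node %s: Error processing request: %v. Please check your configuration.", e.PeerID, e.Err)
	}
	return fmt.Sprintf("Remote node %s: Error processing request: %v", e.PeerID, e.Err)
}

// Unwrap lets errors.Is and errors.As see the underlying error.
func (e *PeerError) Unwrap() error { return e.Err }

// WrapPeerError tags err with the peer it came from. localID is this node's
// own peer ID; sourceID is the peer the error was received from.
func WrapPeerError(localID, sourceID string, err error) error {
	if err == nil {
		return nil
	}
	return &PeerError{PeerID: sourceID, Local: sourceID == localID, Err: err}
}

// errTwitterAuth reuses the message text from the log excerpt above.
var errTwitterAuth = errors.New("there was an error authenticating with your Twitter credentials")

func main() {
	fmt.Println(WrapPeerError("peerA", "peerB", errTwitterAuth)) // remote case
	fmt.Println(WrapPeerError("peerA", "peerA", errTwitterAuth)) // local case
}
```

Keeping the original error reachable through `Unwrap` means callers can still match on the underlying cause while the log line shows where it came from.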

@teslashibe
Contributor

Looks like with a make clean and make run I am now getting a response

INFO[2674] [-] Peer removed from DHT: 16Uiu2HAmKULCxKgiQn1EcfKnq1Qam6psYLDTM99XsZFhr57wLadF 
INFO[2675] [+] Peer added to DHT: 16Uiu2HAm17obtAHet7YkoPH1vcsteBYFVmNJq62gGEJ5xxSu5BAk 
INFO[2678] [+] worker tick                              
INFO[2680] [-] Peer removed from DHT: 16Uiu2HAmNKxVbi6egvWy5ifxGXbfT4UEnURvkRmExA2dLnFsCcbv 
INFO[2682] Node left: /ip4/194.233.92.45/udp/4001/quic-v1/p2p/16Uiu2HAmSLNUCgq42t2HwG5NLGDyQGQhJLz3TDWe7P1GZUtWbhYy/p2p/16Uiu2HAmSLNUCgq42t2HwG5NLGDyQGQhJLz3TDWe7P1GZUtWbhYy 
INFO[2682] [+] Staked node joined: /ip4/194.233.92.45/udp/4001/quic-v1/p2p/16Uiu2HAmSLNUCgq42t2HwG5NLGDyQGQhJLz3TDWe7P1GZUtWbhYy/p2p/16Uiu2HAmSLNUCgq42t2HwG5NLGDyQGQhJLz3TDWe7P1GZUtWbhYy 
INFO[2683] [+] Peer added to DHT: 16Uiu2HAmNKxVbi6egvWy5ifxGXbfT4UEnURvkRmExA2dLnFsCcbv 
INFO[2693] [+] worker tick                              
INFO[2704] [+] Connected                                 Peer=16Uiu2HAmMxU66HfQ2mpQYXK21wuWHzbeGBdYNMyr74FKsPqqngG6 conn="<swarm.Conn[*libp2pquic.transport] /ip4/0.0.0.0/udp/4001/quic-v1 (16Uiu2HAm9GNsYsuXkLGM8r7bzoh2nFxkocvuUi9oFeaMUqdVPTB4) <-> /ip4/185.207.107.117/udp/4001/quic-v1 (16Uiu2HAmMxU66HfQ2mpQYXK21wuWHzbeGBdYNMyr74FKsPqqngG6)>" network="<Swarm 16Uiu2HAm9GNsYsuXkLGM8r7bzoh2nFxkocvuUi9oFeaMUqdVPTB4>"
INFO[2704] [+] Connected                                 Peer=16Uiu2HAmV4jCKqTcUY2LHC3WT9kSBp7kAfmJ4vHZxB5En4A4whF5 conn="<swarm.Conn[*libp2pquic.transport] /ip4/0.0.0.0/udp/4001/quic-v1 (16Uiu2HAm9GNsYsuXkLGM8r7bzoh2nFxkocvuUi9oFeaMUqdVPTB4) <-> /ip4/89.58.14.43/udp/4001/quic-v1 (16Uiu2HAmV4jCKqTcUY2LHC3WT9kSBp7kAfmJ4vHZxB5En4A4whF5)>" network="<Swarm 16Uiu2HAm9GNsYsuXkLGM8r7bzoh2nFxkocvuUi9oFeaMUqdVPTB4>"
INFO[2704] [+] Connected                                 Peer=16Uiu2HAmSLNUCgq42t2HwG5NLGDyQGQhJLz3TDWe7P1GZUtWbhYy conn="<swarm.Conn[*libp2pquic.transport] /ip4/0.0.0.0/udp/4001/quic-v1 (16Uiu2HAm9GNsYsuXkLGM8r7bzoh2nFxkocvuUi9oFeaMUqdVPTB4) <-> /ip4/194.233.92.45/udp/4001/quic-v1 (16Uiu2HAmSLNUCgq42t2HwG5NLGDyQGQhJLz3TDWe7P1GZUtWbhYy)>" network="<Swarm 16Uiu2HAm9GNsYsuXkLGM8r7bzoh2nFxkocvuUi9oFeaMUqdVPTB4>"
INFO[2705] [+] Sending 1 node data records to 16Uiu2HAmSLNUCgq42t2HwG5NLGDyQGQhJLz3TDWe7P1GZUtWbhYy 
INFO[2707] [+] Peer added to DHT: 16Uiu2HAm67cJYWEv3UVprb48cRkg37Dqxroaqjq9yV2r6gPFobhH 
INFO[2708] [+] Peer added to DHT: 16Uiu2HAmVFYu1ui1vbtZNVWeoZGHER7ZsAcDxJWpX5WMAK5hoZGD 
INFO[2708] [+] worker tick                              
INFO[2709] [+] Sending work to network                  
INFO[2709] [+] Actor started                            
INFO[2709] [+] Worker Address: 194.233.92.45            
INFO[2711] [+] Peer added to DHT: 16Uiu2HAm7bHeCLgb7AD9rEyuHGzWdMya8qkPzciw1LxBbGcXz26S 
INFO[2712] [+] blockchain tick                          
INFO[2713] [+] Routing table size: 35                   
INFO[2717] [+] Work done bafkreig2om3reuayqvibrhudxokwogmvzc6fpah6borepgsgb7zfn4xq24 
INFO[2717] [+] Publishing work event : bafkreig2om3reuayqvibrhudxokwogmvzc6fpah6borepgsgb7zfn4xq24 for Peer 16Uiu2HAm9GNsYsuXkLGM8r7bzoh2nFxkocvuUi9oFeaMUqdVPTB4 
[GIN] 2024/08/01 - 13:00:58 | 200 |   8.00158875s |             ::1 | GET      "/api/v1/data/twitter/profile/brendanplayford"
INFO[2723] [+] worker tick  

pkg/workers/workers.go Dismissed
pkg/pubsub/node_event_tracker.go Fixed
@jdutchak
Contributor Author

jdutchak commented Aug 1, 2024

@teslashibe @restevens402

Implement adaptive timeouts based on historical node performance

  • Added the WorkerTimeout parameter to NodeData and implemented the necessary checks:
    WorkerTimeout        time.Time       `json:"workerTimeout,omitempty"`
  • Modified the SendWork function to include the WorkerTimeout check:
    // Check WorkerTimeout: skip workers that reported an error within the last 60 minutes
    nodeData := node.NodeTracker.GetNodeData(p.PeerId.String())
    if nodeData != nil && !nodeData.WorkerTimeout.IsZero() && time.Since(nodeData.WorkerTimeout) < 60*time.Minute {
        logrus.Infof("[+] Skipping worker %s due to timeout", p.PeerId)
        continue
    }
  • Updated the WorkerTimeout when an error is received:
    // Set WorkerTimeout for the node that returned the error
    nodeData := node.NodeTracker.GetNodeData(data.ReceivedFrom.String())
    if nodeData != nil {
        nodeData.WorkerTimeout = time.Now()
        node.NodeTracker.AddOrUpdateNodeData(nodeData, true)
    }
  • Added the following function to clear expired WorkerTimeouts:
    func (net *NodeEventTracker) ClearExpiredWorkerTimeouts() {
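
The body of ClearExpiredWorkerTimeouts is not shown in this comment. As a rough, self-contained sketch of the overall idea (mark a node on error, skip it during a cooldown, clear the mark once the cooldown has passed), something like the following might apply. Everything here is an assumption: `timeoutTracker` and its methods are hypothetical stand-ins, not the real NodeEventTracker API, and the 60-minute window is taken from the check shown in the snippets above.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// cooldown mirrors the 60-minute window used in the SendWork check.
const cooldown = 60 * time.Minute

// timeoutTracker is a hypothetical, simplified stand-in for NodeEventTracker.
type timeoutTracker struct {
	mu       sync.Mutex
	timeouts map[string]time.Time // peer ID -> time the last error was seen
}

func newTimeoutTracker() *timeoutTracker {
	return &timeoutTracker{timeouts: make(map[string]time.Time)}
}

// MarkTimeout records that the peer returned an error at time now.
func (t *timeoutTracker) MarkTimeout(peerID string, now time.Time) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.timeouts[peerID] = now
}

// ShouldSkip reports whether the peer is still inside its cooldown window.
func (t *timeoutTracker) ShouldSkip(peerID string, now time.Time) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	ts, ok := t.timeouts[peerID]
	return ok && now.Sub(ts) < cooldown
}

// ClearExpired removes marks whose cooldown has elapsed and returns how many
// were removed. Deleting map entries while ranging over the map is safe in Go.
func (t *timeoutTracker) ClearExpired(now time.Time) int {
	t.mu.Lock()
	defer t.mu.Unlock()
	cleared := 0
	for id, ts := range t.timeouts {
		if now.Sub(ts) >= cooldown {
			delete(t.timeouts, id)
			cleared++
		}
	}
	return cleared
}

// demoStart is a fixed reference time so the example does not depend on the clock.
var demoStart = time.Date(2024, 8, 1, 12, 0, 0, 0, time.UTC)

func main() {
	tr := newTimeoutTracker()
	tr.MarkTimeout("peerA", demoStart)
	fmt.Println(tr.ShouldSkip("peerA", demoStart.Add(30*time.Minute)))
	fmt.Println(tr.ClearExpired(demoStart.Add(61 * time.Minute)))
}
```

Passing the current time in as a parameter, rather than calling time.Now inside the methods, keeps the cooldown logic easy to test with fixed timestamps.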

@jdutchak jdutchak removed the "DO NOT MERGE" (Something will break if you do) label Aug 2, 2024
restevens402
restevens402 previously approved these changes Aug 2, 2024
Contributor

@restevens402 restevens402 left a comment

The changes look good. No issues for me.

restevens402
restevens402 previously approved these changes Aug 2, 2024
Contributor

@restevens402 restevens402 left a comment

Great last catch. Happy you found it!

5u6r054
5u6r054 previously approved these changes Aug 2, 2024
@jdutchak jdutchak merged commit 5aa99ab into test Aug 2, 2024
11 checks passed
@jdutchak jdutchak deleted the fix/tweet-errors branch August 2, 2024 22:30

4 participants