[LAS-395] Monitoring not working, if not started in first place Created: 03/Feb/17  Updated: 15/Mar/19  Resolved: 15/Mar/19

Status: Closed
Project: las2peer
Component/s: Core, las2peer Monitoring
Affects Version/s: v0.7.5
Fix Version/s: v0.8.0

Type: Bug Priority: Major
Reporter: Thomas Cujé Assignee: Philipp Hossner
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Fix Release Date: 13/Mar/19


Currently, the las2peer monitoring needs to be started first in order to work properly.

This constraint is not acceptable for production deployments, since a network generally cannot be restarted as a whole.

Furthermore, a monitoring observer may want to connect to an already running network.

Comment by Thomas Cujé [ 10/Feb/17 ]

After the meeting today we came up with the idea to store the monitoring data inside the network.

Is this an option? Dominik Renzel, Peter de Lange

Comment by Dominik Renzel [ 10/Feb/17 ]

IMHO not without a performant solution for running queries on the data. It's not only about having the data, but also about making sense of it in reasonable fusions and aggregations. This would most likely be a master's thesis of its own... Yet an interesting one...

Comment by Peter de Lange [ 13/Feb/17 ]

I have nothing against rethinking technical "details" of the monitoring concept, but I want to raise a more general point here:
The monitoring isn't used currently. We only use success models for nodes that were developed more than three years ago, when, for example, there was session-based access (via the HTTP connector) and things like "Agent Upload Successful" were measures reflecting the node's stability. For services, we don't have a single working success model that tells me anything useful about a currently used service in our seed network. I think before we tackle any technical details here, we should focus on developing a "Monitoring Practice" that we actually use over a certain amount of time. This also includes actually reflecting on the monitored data and deriving some insights from it. Before we do this, I think we should not touch any technical details. Starting the monitoring node first in our seed network might be far from optimal, but, practically speaking, it does not influence our current setting, does it?

So please, if we want to do something to the monitoring, let's use it properly first! This is an appeal to all of us, I am glad to help or discuss any success models you develop for your services!

Comment by Dominik Renzel [ 13/Feb/17 ]

Although Peter's comment is a slap in my face, it's well-justified... I totally back this. The only thing I can do for now is to leave a link to my dissertation, which should give you a better impression of both the theoretical and practical thoughts behind MobSOS: http://publications.rwth-aachen.de/record/667644. However, I see that Ralf Klamma and Peter de Lange know well in which direction my heritage should be driven in the future.

Comment by Peter de Lange [ 13/Feb/17 ]

...just wanted to mention that it was not intended as a slap in your - or anyone's - face. But if it has to be perceived as one, it should hit all of us - as the las2peer team - in a roundhouse-kick manner. So please, to all of us: it was nothing against anyone personally. A community develops practices that all have to follow to evolve, right?

Comment by Dominik Renzel [ 13/Feb/17 ]

Peter de Lange, no hard feelings at all! Wake-up calls are always necessary!

Comment by Philipp Hossner [ 10/Mar/19 ]

I just opened a PR on GitHub that at least partially fixes the issue: https://github.com/rwth-acis/las2peer/pull/37

Short version: The MonitoringObserver can now handle availability issues of the monitoring service and starts to log again as soon as the service becomes available. Messages logged during monitoring service downtime are still discarded, though.
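
The behavior described above could look roughly like the following sketch. It is not the code from the PR: `MonitoringSink`, `ResilientObserver`, and `InMemorySink` are hypothetical names invented for illustration, and the availability check is an assumption about how such an observer might detect downtime. It only shows the two stated properties: logging resumes once the service is reachable, and messages logged during downtime are dropped rather than buffered.

```java
import java.util.ArrayList;
import java.util.List;

// Small demo: the sink starts unavailable, then becomes available.
class ObserverDemo {
    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        ResilientObserver observer = new ResilientObserver(sink);

        observer.log("node started");      // dropped, sink is down
        sink.setAvailable(true);
        observer.log("service deployed");  // forwarded

        System.out.println("received:  " + sink.getReceived());
        System.out.println("discarded: " + observer.getDiscardedCount());
    }
}

// Hypothetical stand-in for the monitoring service (not a las2peer interface).
interface MonitoringSink {
    boolean isAvailable();
    void deliver(String message);
}

// Hypothetical observer that tolerates an unavailable sink.
class ResilientObserver {
    private final MonitoringSink sink;
    private long discarded = 0;

    ResilientObserver(MonitoringSink sink) {
        this.sink = sink;
    }

    // Returns true if the message was forwarded, false if it was dropped.
    boolean log(String message) {
        // Availability is re-checked on every call, so logging resumes
        // automatically as soon as the sink is reachable again.
        if (!sink.isAvailable()) {
            discarded++; // no buffering: the message is lost
            return false;
        }
        try {
            sink.deliver(message);
            return true;
        } catch (RuntimeException e) {
            // The sink may disappear between the check and the delivery.
            discarded++;
            return false;
        }
    }

    long getDiscardedCount() {
        return discarded;
    }
}

// In-memory sink whose availability can be toggled, for demonstration only.
class InMemorySink implements MonitoringSink {
    private boolean available = false;
    private final List<String> received = new ArrayList<>();

    void setAvailable(boolean available) {
        this.available = available;
    }

    List<String> getReceived() {
        return received;
    }

    @Override
    public boolean isAvailable() {
        return available;
    }

    @Override
    public void deliver(String message) {
        if (!available) {
            throw new IllegalStateException("sink unavailable");
        }
        received.add(message);
    }
}
```

Buffering the dropped messages in a bounded queue and flushing them on reconnect would address the remaining limitation, at the cost of memory use and ordering questions; the sketch deliberately leaves that out to match the behavior described above.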

Comment by Peter de Lange [ 15/Mar/19 ]

Thank you very much. Let us close this issue for now. The constraint that monitoring without a monitoring service is not possible is ok I guess!

Generated at Mon Oct 14 09:12:24 CEST 2019 using JIRA 7.8.0#78000-sha1:4568b9d484113d74dfb6f152fb925b5fa1be2ef7.