API review
Proposer: Jack O'Quin
Present at review:
- Daniel Stonier
- Piyush Khandelwal
- Jihoon Lee
Proposed new scheduler package
This is a first draft API for a simple_scheduler package for the rocon_concert metapackage.
Goals
- This API will support multiple implementations so we can conveniently compare different scheduling algorithms.
- It provides an interface for ROCON services to request resources, such as robots, devices or strictly software processes, for their own private use.
- It provides an interface where ROCON services can request and make use of a resource that is shared across services.
- Resources previously allocated to a ROCON service may sometimes need to be taken away and given to other higher-priority services.
Assumptions
Environs
- The scheduler runs as a ROS node on the same master as the ROCON conductor, the ROCON services and other solution components.
- The ROCON conductor provides the scheduler with up-to-date information about robot availability.
Requester
- The scheduler provides a collection of ROS topics for requesting resources:
- If the required resources are not immediately available, the request will be queued and may not complete for a long time.
- The requesting node will be notified when it is waiting via a feedback topic.
- The requesting service may cancel its queued request, if it no longer wishes to wait.
- A request can be made for a single resource, or a batched set of resources.
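The requester behaviour listed above can be illustrated with a small, ROS-free sketch. Everything in it (the ToyScheduler class, the Status values, the feedback list standing in for the feedback topic) is hypothetical and is not the proposed scheduler_msgs interface. It only shows queueing, all-or-nothing handling of a batched request, waiting notifications and cancellation.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum
from typing import List


class Status(Enum):
    WAITING = "waiting"
    GRANTED = "granted"
    CANCELED = "canceled"
    RELEASED = "released"


@dataclass(eq=False)  # compare requests by identity, not by field values
class Request:
    requester: str
    resources: List[str]  # one entry, or several for a batched request
    status: Status = Status.WAITING


class ToyScheduler:
    """Hypothetical in-memory stand-in for the scheduler node."""

    def __init__(self, available):
        self.available = set(available)
        self.queue = deque()
        self.feedback = []  # stands in for the feedback topic

    def _notify(self, request):
        self.feedback.append((request.requester, request.status))

    def _dispatch(self):
        # Grant every queued request whose whole batch is free right now.
        for request in list(self.queue):
            wanted = set(request.resources)
            if wanted <= self.available:
                self.available -= wanted
                self.queue.remove(request)
                request.status = Status.GRANTED
                self._notify(request)

    def submit(self, request):
        self.queue.append(request)
        self._dispatch()
        if request.status is Status.WAITING:
            self._notify(request)  # tell the requester it is waiting
        return request

    def cancel(self, request):
        # Only a queued request can be withdrawn.
        if request.status is Status.WAITING:
            self.queue.remove(request)
            request.status = Status.CANCELED
            self._notify(request)

    def release(self, request):
        # Hand resources back and see whether anyone queued can now run.
        if request.status is Status.GRANTED:
            self.available |= set(request.resources)
            request.status = Status.RELEASED
            self._dispatch()


scheduler = ToyScheduler(["robot_1", "robot_2"])
first = scheduler.submit(Request("service_a", ["robot_1", "robot_2"]))
second = scheduler.submit(Request("service_b", ["robot_1"]))
third = scheduler.submit(Request("service_c", ["robot_2"]))
scheduler.cancel(third)    # service_c gives up waiting
scheduler.release(first)   # service_b can now be granted robot_1
```

In a real implementation the submit, cancel and feedback paths would be ROS topics rather than method calls. The granting policy here (first fit over the queue) is only one of the algorithms the API is meant to allow.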
Preemption
- The scheduler and the ROCON services cooperate in an effort to handle resources sensibly.
- When a resource needs to be pre-empted, the scheduler asks the original owner to release it cleanly.
- If the owner does not or cannot comply within a reasonable time, the scheduler asks the conductor to terminate those connections.
Pre-emption decisions are based on the (dynamically adjustable) priorities flagged in each request.
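A minimal sketch of the cooperative pre-emption sequence described above, assuming a fixed grace period and a simple "higher number wins" priority comparison. Neither assumption comes from this review: the names Allocation, preemption_step and GRACE_PERIOD are hypothetical, and the returned strings merely label the action the scheduler would take.

```python
from dataclasses import dataclass
from typing import Optional

GRACE_PERIOD = 5.0  # seconds; hypothetical value for "a reasonable time"


@dataclass
class Allocation:
    owner: str
    priority: int
    release_requested_at: Optional[float] = None


def preemption_step(allocation, contender_priority, now,
                    grace_period=GRACE_PERIOD):
    """Return the next action for one resource held by `allocation`."""
    # Assumption: only a strictly higher priority may pre-empt.
    if contender_priority <= allocation.priority:
        return "keep"
    # First ask the current owner to release the resource cleanly.
    if allocation.release_requested_at is None:
        allocation.release_requested_at = now
        return "request_release"
    # The owner did not comply in time: escalate to the conductor.
    if now - allocation.release_requested_at >= grace_period:
        return "terminate_via_conductor"
    return "wait"


held = Allocation(owner="service_a", priority=0)
actions = [
    preemption_step(held, contender_priority=0, now=0.0),
    preemption_step(held, contender_priority=10, now=0.0),
    preemption_step(held, contender_priority=10, now=2.0),
    preemption_step(held, contender_priority=10, now=5.0),
]
```

Because priorities are dynamically adjustable, a real scheduler would re-evaluate this decision whenever a request's priority changes, not only when a new request arrives.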
Resources
- The resources being allocated are relatively substantial.
- They may remain in use for a long time.
- Frequent allocation and deallocation are not the normal usage.
- The resources being allocated are unreliable and inconstant.
- They may leave the concert of their own volition (clean exit).
- They may be out of wireless range, but still officially connected.
- They may still be registered but not contactable (e.g. system crashed, battery out); the symptom is the same as being out of wireless range.
- They can represent software nodes that consume computational units.
- They may make themselves available for a single service, or optionally can be shared over multiple requests (e.g. map database, camera, or software algorithm node).
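One way to picture the unreliability cases above is as a small availability classification. This is only a sketch: the Availability names, the classify function and the 10 second contact timeout are assumptions made for illustration, not part of the proposal. It treats "out of wireless range" and "crashed" identically, since the notes say the symptoms are the same.

```python
from enum import Enum


class Availability(Enum):
    AVAILABLE = "available"  # registered and recently heard from
    MISSING = "missing"      # registered, but not contactable
    GONE = "gone"            # left the concert (clean exit)


def classify(registered, seconds_since_contact, timeout=10.0):
    """Classify a resource from the scheduler's point of view."""
    if not registered:
        return Availability.GONE
    # Out of range and crashed look the same: no recent contact.
    if seconds_since_contact > timeout:
        return Availability.MISSING
    return Availability.AVAILABLE


states = [
    classify(registered=True, seconds_since_contact=1.0),
    classify(registered=True, seconds_since_contact=60.0),
    classify(registered=False, seconds_since_contact=0.0),
]
```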
Links & References
Experimental sources and documentation:
https://github.com/jack-oquin/rocon_msgs/tree/hydro-devel/scheduler_msgs
http://farnsworth.csres.utexas.edu/docs/rocon_scheduler_requests/html/index.html
Meeting Notes:
Hangout Notes - 3rd December - crystallising direction via listing of the most important use cases.
Discussion Threads
Flag items that are still pending with a /!\.
Interface Mechanisms - a mix of pub/sub topics
Priorities - most scheduling algorithms have a concept of priorities
- Responsibility of the service requester to set priorities.
Requests - shaping what a request should look like.
- The resource is a rocon resource identifier for a rapp.
- Handle requests for batched resources by specifying a resource list.
- Priority is flagged as an int16 (signed) to provide flexibility.
- Platform info used as a hint/filter for a resource to the scheduler.
- Remappings must be included for a resource to start apps.
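The request shape discussed above could be sketched as plain Python data classes. The field names, the example rapp identifier and the exact-match platform filter are all assumptions made for illustration. The actual message definitions are in the scheduler_msgs package linked earlier and may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List

INT16_MIN, INT16_MAX = -2**15, 2**15 - 1


@dataclass
class ResourceSpec:
    name: str                    # rocon resource identifier for a rapp
    platform_info: str = ""      # hint/filter; empty means "any platform"
    remappings: Dict[str, str] = field(default_factory=dict)


@dataclass
class ResourceRequest:
    resources: List[ResourceSpec]  # more than one entry = batched request
    priority: int = 0              # must fit in a signed 16-bit integer

    def __post_init__(self):
        if not self.resources:
            raise ValueError("a request needs at least one resource")
        if not INT16_MIN <= self.priority <= INT16_MAX:
            raise ValueError("priority does not fit in an int16")


def candidates(spec, platforms):
    """Names of robots whose platform passes the spec's filter.

    `platforms` maps robot name -> platform string. Exact string
    matching is an assumption, not the proposed filter semantics.
    """
    return sorted(
        robot for robot, platform in platforms.items()
        if not spec.platform_info or platform == spec.platform_info
    )


teleop = ResourceSpec(
    name="example_rapps/teleop",              # hypothetical rapp name
    platform_info="turtlebot",
    remappings={"cmd_vel": "/robot/cmd_vel"},  # hypothetical remapping
)
request = ResourceRequest(resources=[teleop], priority=-5)
matches = candidates(teleop, {"robot_1": "turtlebot", "robot_2": "pr2"})
```

A signed priority leaves room for requests that should yield to the default priority of 0 as well as requests that should outrank it.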
Starting Rapps - whose responsibility is it to start rapps?
- The scheduler will start and stop rapps, keeping track of where they are assigned.
Package Name - what should this component be called?
rocon_scheduler_requests
Moved to future concerns:
Software Resources - software applications and robots with equal citizenship status as schedulable resources.
- Don't worry for now, start inside private software services and tackle this problem later when we try building a concert farm for managing software nodes.
Sharing Resources - sharing a resource across multiple concert services.
- Work around this in rapps for now. The responsibility should move to the scheduler, but that needs time to design properly.
Release Plan - when and where should we release?
- Igloo (~Apr '14)
Optional Requests - optional resource flagging in a batched resource request.
- Handle together with requests for batched resources, but pending a later iteration (Jan+).
Future Reservations - reserving resources for later use (e.g. two hours from now).
- Esoteric case, leave for future development.