Please ask about problems and questions regarding this tutorial on answers.ros.org. Don't forget to include in your question the link to this page, the versions of your OS & ROS, and also add appropriate tags. |
Introduction to using the SQL database interface
Description: This is a step-by-step introduction to using the SQL database interface. We will create a new class for holding our data, retrieve data from a database and write back modified fields.Tutorial Level: BEGINNER
Next Tutorial: The SQL database interface: advanced concepts
Contents
Creating the Database
First, let's create the database itself. We will create a database called "students" with two tables. The first table, called student will keep a record for each student. The second one, called grade will keep a record of grades for all students.
In general, a relational database provides many well-established ways of ensuring consistency in the data, which is one of the key reasons for using them. In particular, for this database:
student_id is a PRIMARY_KEY of the student table, ensuring every student get a unique id
student_majors is an array of strings (as a student can have multiple majors)
student_id is a foreign key in the 'grade' table, referencing the student table, ensuring any grade is assigned to a real student
the combination of student_id and grade_subject is UNIQUE in the grade table, ensuring that one student can only have a single grade in a given subject
each grade also gets a unique grade_id, acting as a PRIMARY_KEY in the grade table.
- in a real database, we might have taken additional steps to ensure data consistency, such as a separate table for subjects to ensure we don't misspell a subject name, etc. However, this is a toy example.
Starting the Package
All the code in this tutorial is provided as an example in the student_database package. Here we will rebuild it from scratch for illustration purposes.
We will now create a ROS package to make use of this database. This tutorial assumes you are familiar with the ROS build environment and the tools for creating and using packages.
Start your package:
roscreate-pkg student_database database_interface
Inside your package, create and edit the code file src/students.cpp. We will add all the code below to this file.
Creating C++ Containers (Classes) for the Data
This is the most significant overhead when using the database with our interface. Essentially, we are creating a C++ class that mirrors our data structure in the database. However, it is done only once, when you first establish your data structures - adding fields later is very fast. Furthermore, in most cases, after you are done with this step you can forget about the SQL behind the scenes.
The `Student` Class
We will first create a class to hold instances of the 'student' data. Inside the src/students.cpp file, add the following code:
#include <string> #include <vector> #include <database_interface/db_class.h> class Student : public database_interface::DBClass { public: database_interface::DBField<int> student_id_; database_interface::DBField<std::string> student_first_name_; database_interface::DBField<std::string> student_last_name_; database_interface::DBField< std::vector<std::string> > student_majors_; database_interface::DBField<double> student_gpa_; Student() : student_id_(database_interface::DBFieldBase::TEXT, this, "student_id", "student", true), student_first_name_(database_interface::DBFieldBase::TEXT, this, "student_first_name", "student", true), student_last_name_(database_interface::DBFieldBase::TEXT, this, "student_last_name", "student", true), student_majors_(database_interface::DBFieldBase::TEXT, this, "student_majors", "student", true), student_gpa_(database_interface::DBFieldBase::TEXT, this, "student_gpa", "student", true) { primary_key_field_ = &student_id_; fields_.push_back(&student_first_name_); fields_.push_back(&student_last_name_); fields_.push_back(&student_majors_); fields_.push_back(&student_gpa_); setAllFieldsReadFromDatabase(true); setAllFieldsWriteToDatabase(true); } };
The `Student` Class Explained
All database classes must inherit from database_interface::DBClass:
#include <database_interface/db_class.h> class Student : public database_interface::DBClass {
Fields are made public in this toy example for easy access, but you can treat them as you would any member data of your C++ classes. Key requirement: all fields that are to be stored in the database must be wrapped as database_interface::DBField<>, templated on the type of data they hold:
public: database_interface::DBField<int> student_id_; database_interface::DBField<std::string> student_first_name_; database_interface::DBField<std::string> student_last_name_; database_interface::DBField< std::vector<std::string> > student_majors_; database_interface::DBField<double> student_gpa_;
Note that is not necessary that the name of the variable (e.g. student_id_) be identical to the name of the column in the database (e.g. "student_id"). However, that is advisable to make the code easier to read. In this example, we have used the column names as member variable names, but also appended _ at the end, according to ROS coding standards.
Key requirement: all fields must be initialized in the constructor. A field constructor takes the following arguments:
the type of serialization used (TEXT for all fields in this toy example)
the owner of the field ( usually this, or the instance of the DBClass that owns that field)
- the name of the table column corresponding to that field
- the name of the table in which the field is stored
- whether it is allowed to modify en entry in the database using a reference to this field.
Student() : student_id_(database_interface::DBFieldBase::TEXT, this, "student_id", "student", true), student_first_name_(database_interface::DBFieldBase::TEXT, this, "student_first_name", "student", true), student_last_name_(database_interface::DBFieldBase::TEXT, this, "student_last_name", "student", true), student_majors_(database_interface::DBFieldBase::TEXT, this, "student_majors", "student", true), student_gpa_(database_interface::DBFieldBase::TEXT, this, "student_gpa", "student", true) {
Finally, one field must be a primary key. All instances of DBClass have a primary_key_field_ pointer, which must be set on construction. All other fields must go into the fields_ array of the DBClass:
primary_key_field_ = &student_id_; fields_.push_back(&student_first_name_); fields_.push_back(&student_last_name_); fields_.push_back(&student_majors_); fields_.push_back(&student_gpa_);
Optional: let all fields be read automatically when an instance of a student is retrieved from the database. Also, let all fields be written automatically when an instance of a student is saved the database. These options are useful if you have a very large field (e.g. a binary bitmap with the picture of the student) which you do not want retrieved automatically whenever you get a student info from the database:
setAllFieldsReadFromDatabase(true); setAllFieldsWriteToDatabase(true);
The `Grade` Class
Similarly, let's add the class definition for grade:
class Grade : public database_interface::DBClass { public: database_interface::DBField<int> grade_id_; database_interface::DBField<int> student_id_; database_interface::DBField<std::string> grade_subject_; database_interface::DBField<double> grade_grade_; Grade() : grade_id_(database_interface::DBFieldBase::TEXT, this, "grade_id", "grade", true), student_id_(database_interface::DBFieldBase::TEXT, this, "student_id", "grade", true), grade_subject_(database_interface::DBFieldBase::TEXT, this, "grade_subject", "grade", true), grade_grade_(database_interface::DBFieldBase::TEXT, this, "grade_grade", "grade", true) { primary_key_field_ = &grade_id_; fields_.push_back(&student_id_); fields_.push_back(&grade_subject_); fields_.push_back(&grade_grade_); setAllFieldsReadFromDatabase(true); setAllFieldsWriteToDatabase(true); } };
Connecting to the Database
Having designed our data containers, we will now connect to the database and start using the data:
#include <boost/shared_ptr.hpp> #include <database_interface/postgresql_database.h> int main(int argc, char **argv) { database_interface::PostgresqlDatabaseInterface database("wgs36.willowgarage.com", "5432", "willow", "willow", "students"); if (!database.isConnected()) { std::cerr << "Database failed to connect \n"; return -1; } std::cerr << "Database connected successfully \n"; }
The PostgreSQL database constructor takes the following arguments, needed for establishing a connection:
- the address of the PostgreSQL server ("wgs36.willowgarage.com")
- the port used for the connection (5432)
- the PostgreSQL username and password ("willow" and "willow")
- the name of the database to connect to ("students")
Using the database
Retrieving data
Complete instances of our data types are retrieved from the database as vectors of shared pointers:
std::vector< boost::shared_ptr<Student> > students; if (!database.getList(students)) { std::cerr << "Failed to get list of students\n"; return -1; } std::cerr << "Retrieved " << students.size() << " student(s) \n";
In general, all data manipulation functions will return true on success and false on failure, with the data itself returned through one of the arguments passed in by reference. For clarity, we might omit the check for success from now on in this tutorial.
Accessing data
Key requirement: any data wrapped in a DBField<> member of a DBClass instance can be accessed using the .data() function
std::cerr << "Students: \n"; for (size_t i=0; i<students.size(); i++) { std::cerr << students[i]->student_last_name_.data() << ", " << students[i]->student_first_name_.data() << ": \n"; std::cerr << " GPA: " << students[i]->student_gpa_.data() << "\n"; std::cerr << " Major(s): "; for (size_t j=0; j<students[i]->student_majors_.data().size(); j++) { if (j!=0) std::cerr << ", "; std::cerr << students[i]->student_majors_.data().at(j); } std::cerr << "\n"; }
Note that the .data() function returns a reference to the underlying data type, such as, in this example, std::vector<std::string>:
for (size_t j=0; j<students[i]->student_majors_.data().size(); j++) { if (j!=0) std::cerr << ", "; std::cerr << students[i]->student_majors_.data().at(j); }
Retrieving data using addition WHERE clauses
Let's get all the grades for the student John Smith. We need an addition filter in our query, for which we will use a where clause:
std::vector< boost::shared_ptr<Grade> > grades; std::string where_clause("student_id=1"); database.getList(grades, where_clause); std::cerr << "Student with id 1 has " << grades.size() << " grade(s) on record\n";
Efforts are currently under way to provide an easier and more general interface for where clauses. Note that in real examples, one might provide a convenience wrapper function, e.g. getGradesForStudent(int student_id), which would do the above legwork.
Modifying individual fields
There was some mistake, and one of the grades is wrong. Let's correct it:
grades[0]->grade_grade_.data() = 2.5; if (!database.saveToDatabase( &(grades[0]->grade_grade_) ) ) std::cerr << "Failed to modify grade\n"; else std::cerr << "Grade modified successfully\n";
Note that the saveToDatabase function writes the value of an individual field to the database. It only takes in a pointer to that field. However, it requires that the primary key field of the owner of that field be set correctly, to identify the entry that we are modifying. In our case, the owner is grades[0], whose primary key field (grades[0]->grade_id_) is already set from the retrieval query.
It is also possible to load individual fields from the database, instead of retrieving a complete instance:
//we have forgotten a grade grades[0]->grade_grade_.data() = 0.0; //reload it from the database if (!database.loadFromDatabase( &(grades[0]->grade_grade_) ) ) std::cerr << "Failed to load grade field from database\n"; else std::cerr << "Grade field (re)loaded, its value is " << grades[0]->grade_grade_.data() << "\n";
Just as saveToDatabase(...), loadFromDatabase(...) requires that the primary key field of the owner of that field be set correctly, to identify the entry that we are reading from.
Adding a new entry to the database
Let's now create an entry from scratch.
Grade new_grade; new_grade.student_id_.data() = 1; new_grade.grade_subject_.data() = "astrology"; new_grade.grade_grade_.data() = 4.0; new_grade.grade_id_.data() = 4; if (!database.insertIntoDatabase(&new_grade)) std::cerr << "Grade insertion failed\n"; else std::cerr << "Grade insertion succeeded\n";
Notes:
this will work the first time around, but fail if you execute it again. The reason is that the database does not allow two grades to have the same primary key, in our case the grade_id_ field. You should get a descriptive error message at the console if you try it.
this example requires us to fill in the primary key (grade_id_) by hand. This is cumbersome, as it requires us to know what values are already taken. In the advanced tutorials we will be using a sequence to do this automatically.
even if you change the grade_id_, the database will not allow you to insert another grade for the same combination of student_id and grade_subject, This is due to the UNIQUE constraint on the combination of these two fields. You should get a descriptive error message at the console if you try it.
Compiling and running the code
To compile the code, add the following entry in the CMakeLists.txt file in your package:
rosbuild_add_executable(students src/students.cpp)
Type make to compile, followed by bin/students to run. You should get the following output:
Database connected successfully Retrieved 2 student(s) Students: Smith, John: GPA: 3 Major(s): economics, agriculture Foo, Jimmy: GPA: 1.5 Major(s): communication Student with id 1 has 2 grade(s) on record Grade modified successfully Grade insertion succeeded
The finishing touches
After all that grade modification, let's recompute the GPA of the students. But wait! This is a SQL database, no need to do it from C++ code. We can just run an SQL query directly on the database:
UPDATE student SET student_gpa = ( SELECT avg(grade_grade) FROM grade WHERE grade.student_id = student.student_id )
For now, there is no mechanism for executing generic SQL queries using the C++ interface, but this feature will be added in future releases. However, you can manipulate your data by running native SQL queries directly on your database server, an option which we found tremendously valuable when managing our own data sets.
Further reading
Proceed to the Advanced Tutorial next for more options available when using the database interface.