pixel is in Charge


22.07.2009   04:23

Peering Down the Rabbit Hole with cl-perec

Don't forget to check out the other articles in this series:

  1. Getting Started with cl-perec
  2. Persisting Simple Types with cl-perec
  3. Sensible Serializing with cl-perec
  4. Peering Down the Rabbit Hole with cl-perec (you are here)

As anyone following this guide to cl-perec has undoubtedly noticed, while I've spent considerable time exploring how to put data in the database, I've not spent any on getting all that precious data back out. Today, I shall remedy that.

Let us suppose your PHB wanders by and asks you what the title is of the report located in the file "/share/tps-reports/Testing". Not to worry, you can select a single instance using cl-perec:select-instance.¹

select-instance has a fairly simple calling syntax.

(select-instance (/var-name/ /class-of-instance/)
  (where [condition-form]))

For example, we can select our tps report by file name with:

(cl-perec:with-transaction
  (cl-perec:select-instance (report tps-report)
    (cl-perec:where
      (equal (report-path-of report)
             #p"/share/tps-reports/Testing"))))

Note that accessor methods on the instance variable are used to pick the SQL column, and an instance of the object to which we want to compare that column is used for our value. This is because the value-to-be-compared goes through all the lisp→sql conversion machinery.²

Of particular interest is the where clause, which is proper CL³. The upshot to this is that it's very easy to select based upon an arbitrary predicate. The downside is that, if cl-perec can't figure out how to translate your where clause, it will filter the parts it doesn't understand in Lisp. Failure to remember that may result in accidentally hosing your performance. #'cl-perec:start-sql-recording is very useful here for checking if things are being converted to SQL in the way you expect.

Simple, standard functions, such as 'equal, get translated to the obvious SQL, as do boolean expressions involving AND, NOT, and OR. Notably excepted, however, string= and string-equal are not translated into SQL.

As you might expect, user-defined equality predicates are not converted to SQL. As usual, however, cl-perec offers unexported functionality to extend itself.

Say, for instance, you want to be able to query report-path using wild pathnames, to fetch all the tps-reports in a certain directory, like so:

(cl-perec:with-transaction
  (cl-perec:select-instances (r tps-report)
    (cl-perec:where (pathname-match-p
                      (report-path-of r)
                      #p"/share/tps-reports/*"))))

A legitimate use case if I ever saw one! To make this work in SQL, we need to convert it to use the LIKE operator, or possibly a regular expression. Rummaging around in query/mapping.lisp, we find the definer for like which points us in a direction. Awesome!

A little trial and error gets us to a very rough start.

(cl-def:def (cl-perec::query-function :lisp-args t) pathname-match-p (path wild)
  "Convert a comparison to a :wild pathname into SQL.  Does not support all possible :wild pathnames."
  (declare (cl-perec::persistent-type (cl-perec::forall (a) (function (a a) boolean))))
  (let* ((wild (cl-perec::value-of wild))
         (root (make-pathname :name nil :type nil :defaults wild))
         (name (if (wild-pathname-p wild :name)
                   "%"
                   (pathname-name wild)))
         (type (if (wild-pathname-p wild :type)
                   "%"
                   (pathname-type wild))))
    (cl-perec::sql-like :string (cl-perec::syntax-to-sql path)
                        :pattern (format nil "~a~a~@[.~a~]" root name type)
                        :case-sensitive-p t)))

path and wild are passed in as cl-perec 'syntax nodes'. syntax-to-sql converts a syntax node to SQL, and value-of, assuming a literal value was used, gets that literal value. Then it's just a matter of converting :wilds into SQL LIKE's wildcard, the percent sign (%).

Regrettably, this is pretty fragile. It only works if the wild pathname is passed in as a literal value; neither let-bound variables, special variables, nor #'make-pathname will work. (Not to mention the flaws in the matching itself!) Let's see what we can do about that.

(cl-def:def (cl-perec::query-function :lisp-args t) pathname-match-p (path wild)
  "Convert a comparison to a :wild pathname into SQL.  Does not support all possible :wild pathnames."
  (declare (cl-perec::persistent-type (cl-perec::forall (a) (function (a a) boolean))))
  (cl-perec::sql-like
    :string (cl-perec::syntax-to-sql path)
    :pattern (cl-perec::syntax-to-sql
              (cl-perec::make-special-form
               :operator 'let*
               :operands `(((wild ,wild)
                            (root (make-pathname :name nil :type nil :defaults wild))
                            (name (if (wild-pathname-p wild :name)
                                      "%"
                                      (pathname-name wild)))
                            (type (if (wild-pathname-p wild :type)
                                      "%"
                                      (pathname-type wild))))
                           (format nil "~a~a~@[.~a~]" root name type))))
    :case-sensitive-p t))

This is a bit messy—because the arguments we're given have already been converted into an abstract syntax tree, what would otherwise be fairly straightforward code must be converted into an AST as well—but it works. make-special-form creates an object that eventually gets evaluated, and it somehow all magically works out. The in-sql matching still isn't perfect, but it's pretty useable. Give it a try!

(cl-perec:with-transaction
  (cl-perec:select-instances (r tps-report)
    (cl-perec:where
      (pathname-match-p (report-path-of r)
                        #p"/share/tps-reports/*"))))
(cl-perec:with-transaction
  (cl-perec:select-instances (r tps-report)
    (cl-perec:where
      (pathname-match-p (report-path-of r)
                        (make-pathname :directory '(:absolute "share" "tps-reports") :name :wild)))))
(let ((w (make-pathname :directory '(:absolute "share" "tps-reports") :name :wild)))
  (cl-perec:with-transaction
    (cl-perec:select-instances (r tps-report)
      (cl-perec:where
        (pathname-match-p (report-path-of r)
                          w)))))
(defvar *wild* (make-pathname :directory '(:absolute "share" "tps-reports") :name :wild))
(cl-perec:select-instances (r tps-report)
  (cl-perec:where
    (pathname-match-p (report-path-of r)
                      *wild*)))

All should produce SQL that looks something like this:

 BEGIN
 $1 = /share/tps-reports/% as TEXT
 SELECT r._oid, r.report_title, r.report_text, r.report_path FROM tps_report_ap r WHERE ((r.report_path LIKE $1::TEXT))
 COMMIT

Pathname matching that handles directories with :wild and :wild-inferiors properly is left as an exercise to the reader, but I'll leave you with a few hints: look into the re-like query-function, which provides you with access to the database server's regular expression engine; and ignore the fact you can't portably rely on the format of namestrings.

Footnotes

  1. You can also get multiple objects using select-instances which, other than the pluralization, has identical syntax to select-instance. You'll see that in action a little later.
  2. You could write your lisp→sql conversion function to pass through strings and avoid the temporary object, but presumably you're using an object instead of a string because it buys you something.
  3. There is, however, a subtle caveat: the parts of an expression cl-perec can convert to SQL are (quite reasonably) not verified in Lisp land. This means comparison functions can return different results depending on whether they were applied via SQL or in Lisp.
    (with-transaction
      (values
        (select-instance (r tps-report)
          (where (eql (report-title-of r) "Testing P")))
        (select-instance (r tps-report)
          (where (funcall #'eql (report-title-of r) "Testing P")))))
    
    #<TPS-REPORT …>, NIL
  4. I assume there's an easier way and I just haven't figured it out yet. This is scary far into the depths of cl-perec and all the meta hurts my little brain.

21.07.2009   00:17

Sensible Serializing with cl-perec

Don't forget to check out the other articles in this series:

  1. Getting Started with cl-perec
  2. Persisting Simple Types with cl-perec
  3. Sensible Serializing with cl-perec (you are here)
  4. Peering Down the Rabbit Hole with cl-perec

If a class slot contains an object, cl-perec will do its best to persist that object into the database using cl-serializer. Unfortunately, that persistence leaves something to be desired: namely, interoperability with non-lisp systems. I mean, binary strings and type codes? Ew!¹

Fortunately, because we know our data fairly well, we're going to serialize without the hassle of type codes. First, let's revisit our TPS-REPORT class, and add a pathname² at which that report can be found.

WARNING! Don't execute this yet, or you'll have to restart your lisp image.³

(cl-perec:defpclass* tps-report ()
  ((report-title :type boring-string)
   (report-text  :type even-string)
   (report-path  :type pathname)))

Calling back upon cl-perec::defmapping, which we discovered earlier, we map the pathname type to an sql character type.

(cl-perec::defmapping pathname
  (cl-rdbms::sql-character-large-object-type)
  'pathname-sql-reader
  'pathname-sql-writer)

pathname-sql-reader and -writer don't exist, so we'll have to make them. They aren't terribly complicated functions: all they do is turn a pathname into a string and vice versa. Actually, that sounds kinda like some functions that already exist in Common Lisp, making our job pretty easy.

The reader function is supposed to take a sequence and an index, then get its value from that sequence starting at index. The writer function takes a value to convert, and a sequence plus index where it is supposed to store the value.

(defun pathname-sql-reader (rdbms-values index)
  (pathname (elt rdbms-values index)))
(defun pathname-sql-writer (slot-value rdbms-values index)
  (setf (elt rdbms-values index)
        (namestring slot-value)))

Exciting, no? But there's one more thing we need to do: ensure cl-perec will actually use our mapping.

(pushnew 'pathname cl-perec::*mapped-type-precedence-list*)

Note this is a different list from what we used when we were defining types. I don't know why.

Our mapping defined, it's safe to redefine the tps-report class as above. Redefining the class earlier would have caused the pathname type mapping to be set, after which changing the mapping has no effect.³

That done, we can point our tps-report records to their corresponding file.

(cl-perec:with-transaction
  (make-instance 'tps-report
                 :report-title "Testing P"
                 :report-text "This is a test"
                 :report-path #p"/share/tps-reports/Testing"))
test_db=> select * from tps_report;
_oidreport_titlereport_textreport_path
4752604Testing PThis is a test/share/tps-reports/Testing

And, best of all, non-lisp code can still make sense of report_path!

cl-perec offers an incredible amount of flexibility in how your data is mapped between your lisp image and the database, upon which we have barely scratched the surface. Just remember that with great power comes great responsibility, and probably a few bugs. Also remember that order matters, so if something isn't working you think should be, execution order may be to blame.

Footnotes

  1. As I've mentioned before, the entire point of using an ORM rather than an object database is so other systems in other languages can play along.
  2. Strangely, pathnames are not supported by cl-serializer, so we'd have to jump through these hoops anyway. But it applies just as well to other objects.
  3. cl-perec seems to cache computed values a little too aggressively and I haven't figured out how to force it to recompute things. Unfortunately, this means I end up restarting my lisp pretty frequently during development and that's a pretty painful way to mold a codebase.
  4. I'm told you can use the index to do funky things like splitting a value across multiple columns. I haven't found myself with the desire for that and so leave that particular exercise to the reader (for now, at least).
  5. Specifically, the pathname type would be mapped to the t class, causing pathnames to run through the cl-serializer machinery and eventually error out during a database write because cl-serializer doesn't support pathname serialization.

Continue to Peering Down the Rabbit Hole with cl-perec