Asked By – AP257
What the difference is between
commit() in SQLAlchemy?
I’ve read the docs, but am none the wiser – they seem to assume a pre-understanding that I don’t have.
I’m particularly interested in their impact on memory usage. I’m loading some data into a database from a series of files (around 5 million rows in total) and my session is occasionally falling over – it’s a large database and a machine with not much memory.
I’m wondering if I’m using too many
commit() and not enough
flush() calls – but without really understanding what the difference is, it’s hard to tell!
Now we will see solution for issue: SQLAlchemy: What’s the difference between flush() and commit()?
A Session object is basically an ongoing transaction of changes to a database (update, insert, delete). These operations aren’t persisted to the database until they are committed (if your program aborts for some reason in mid-session transaction, any uncommitted changes within are lost).
The session object registers transaction operations with
session.add(), but doesn’t yet communicate them to the database until
session.flush() is called.
session.flush() communicates a series of operations to the database (insert, update, delete). The database maintains them as pending operations in a transaction. The changes aren’t persisted permanently to disk, or visible to other transactions until the database receives a COMMIT for the current transaction (which is what
session.commit() commits (persists) those changes to the database.
flush() is always called as part of a call to
When you use a Session object to query the database, the query will return results both from the database and from the flushed parts of the uncommitted transaction it holds. By default, Session objects
autoflush their operations, but this can be disabled.
Hopefully this example will make this clearer:
#--- s = Session() s.add(Foo('A')) # The Foo('A') object has been added to the session. # It has not been committed to the database yet, # but is returned as part of a query. print 1, s.query(Foo).all() s.commit() #--- s2 = Session() s2.autoflush = False s2.add(Foo('B')) print 2, s2.query(Foo).all() # The Foo('B') object is *not* returned # as part of this query because it hasn't # been flushed yet. s2.flush() # Now, Foo('B') is in the same state as # Foo('A') was above. print 3, s2.query(Foo).all() s2.rollback() # Foo('B') has not been committed, and rolling # back the session's transaction removes it # from the session. print 4, s2.query(Foo).all() #--- Output: 1 [<Foo('A')>] 2 [<Foo('A')>] 3 [<Foo('A')>, <Foo('B')>] 4 [<Foo('A')>]