Friday, February 24, 2012

Get Quick results using "Row Locator"s (fileid+pageid+rowid)

> It would be great if we could use the "row locator" which is a combination
> of fileid, page id and the row id of the row provided as a 'hint' within
> the query (like we do for indexes) and get the desired row(s). This would
> certainly be a great advantage specifically in searching for columns which
> are PKs or have uniquely constrained indexes, or select top 1s.
Part of the beauty of a relational system is that the physical location of a
piece of data is abstracted from us. How are you going to know the fileid,
page id and row id of a particular row? How expensive is that part of the
lookup going to be? And even now that you have it, how do you know it will
be in the same physical location next week, tomorrow, or even in five
minutes?
You are absolutely correct Piyush, these actions as you describe them happen
ALL the time! They are called using a PRIMARY/UNIQUE KEY value that is
INDEXED as the lookup for the UPDATE/DELETE for the row originally accessed.
And since the data page associated with this particular row (and it's PK/UK
index) will probably still be in RAM (due to SQL Server's incredibly
effective caching algorithms) this subsequent lookup for the DML statement
will likely occur within a few milliseconds tops.
TheSQLGuru
President
Indicium Resources, Inc.
"Manasvin" <piyush-at-manasvin-dot-com> wrote in message
news:uGW5pqHqHHA.1240@.TK2MSFTNGP04.phx.gbl...
>I think this feature is NOT to suggest moving away from a relational system
>but a smarter one. A hint based pattern which is already being used on
>various other aspects including choosing an index for instance withing a
>query. The idea is that requery for the same record or row shouldnt take
>the same effort regardless of cache hits or misses.
> In a typical scenario which I believe happens often illustrated below:
> A record a is queried for viewing.
> Its at this stage the query process would anyways be able to accumalate
> the rowlocation since its got there to collect the data row anyways.
> the application which queried for this data and is now displaying the
> row(s) could maintain the rowlocations keys (and which are not meant to be
> used as permanent or static values). most times the application has
> disconnected and the RDMBS is busy to serve other applications and their
> queries
> Now if there is a requery or an update / delete to the above said row(s)
> the rowlocation(s) could be supplied for these specific queries as a 'hint
> only' but not to undermine the consistency or reliability of the query
> process, rather a smarter way just get to the data quickly. in any case
> this hint(s) may not be any good but could very well be enough to make a
> difference.
> i believe the above sequence of application events or actions do happen
> quite commonly.amongst many application if not most.
> I hope the scenario above makes things a bit clearer. Am I the only one
> who sees this as a very useful feature ?
> regards
> Piyush
> "Aaron Bertrand [SQL Server MVP]" <ten.xoc@.dnartreb.noraa> wrote in
> message news:Oxvw2NGqHHA.3512@.TK2MSFTNGP06.phx.gbl...
>
> "rpresser" <rpresser@.gmail.com> wrote in message
> news:1181151962.535269.172900@.z28g2000prd.googlegr oups.com...
>
|||On 6 Jun, 22:19, "Manasvin" <piyush-at-manasvin-dot-com> wrote:
> yes and all I am saying then is that cache may not be dependable in larger
> time gaps or very big table sets. infact it could be faster than using the
> cache and in a high load scenario could make a significant difference for
> better.
>
The whole suggestion is wrong-headed. Performance is determined solely
by the physical implementation, to which the presence or absence of a
"row locator" adds little or nothing. The advantages of exposing a
physical row locator are tiny when compared to other engine-level
enhancements that could be made but the disadvantages are enormous.
If you want real improvements then let's suggest better support for
Data Independence in the engine.
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
|||On 6 Jun, 21:38, "Manasvin" <piyush-at-manasvin-dot-com> wrote:
> I think this feature is NOT to suggest moving away from a relational system
> but a smarter one.
In that case I suggest you don't know what a relational system is.

> Now if there is a requery or an update / delete to the above said row(s) the
> rowlocation(s) could be supplied for these specific queries as a 'hint only'
> but not to undermine the consistency or reliability of the query process,
> rather a smarter way just get to the data quickly. in any case this hint(s)
> may not be any good but could very well be enough to make a difference.
> i believe the above sequence of application events or actions do happen
> quite commonly.amongst many application if not most.
>
This sounds like a server keyset-based cursor. There is absolutely no
need to return a row locator to the client in order to achieve that.
Let the DBMS handle it. You could I suppose have a hint that pinned
the set of rows in cache, but on the whole SQL Server is pretty good
at cache anyway.
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
|||Do you think a 'row-locator' would be 'cached' later either? Physical I/O
is responsible for at LEAST 80% of the performance issues most database
applications have. How would the engine use this row-locator to get to the
actual row of data stored on some 8K datapage? SOMEHOW, SOMEWAY, some
physical lookup is gonna be required. That information won't be in cache
any longer or more likely than the index page(s) will! Also you are
Completely ignoring the issue of what happens when someone else updates the
row before you try to and, due to making a varchar column value larger that
row no longer fits in the same row-locator slot. Oopsie!! You just had an
error get thrown when you tried to update missing data. Wait, it gets even
better. Say someone did an insert during this delay and the engine placed a
NEW row in that same row-locator slot. Now it is even worse, because you
update the wrong row.
Do yourself a favor and drop this line of thinking. It is REALLY, REALLY
bad from a number of standpoints. :-)
TheSQLGuru
President
Indicium Resources, Inc.
"Manasvin" <piyush-at-manasvin-dot-com> wrote in message
news:ugUkQBIqHHA.3312@.TK2MSFTNGP05.phx.gbl...
> yes and all I am saying then is that cache may not be dependable in larger
> time gaps or very big table sets. infact it could be faster than using the
> cache and in a high load scenario could make a significant difference for
> better.
> "TheSQLGuru" <kgboles@.earthlink.net> wrote in message
> news:OyZy88HqHHA.196@.TK2MSFTNGP05.phx.gbl...
>
|||"Aaron Bertrand [SQL Server MVP]" <ten.xoc@.dnartreb.noraa> wrote in message
news:Oxvw2NGqHHA.3512@.TK2MSFTNGP06.phx.gbl...
>.
> Part of the beauty of a relational system is that the physical location of
> a piece of data is abstracted from us. How are you going to know the
> fileid, page id and row id of a particular row? .
You have you head inside when it should be outside. From an application
developers
point of view what your describing is a KEY.
Indexer Expression
http://www.alphora.com/docs/O-System.iIndexer.html
www.beyondsql.blogspot.com

No comments:

Post a Comment