Failure Details: Get Quick results using "Row Locator"s (fileid+pageid+rowid)

Hi.
I've posted this to the SQL wish center (Connect/sqlserver/feedback) as a
suggested new feature.
(https://connect.microsoft.com/SQLSe...=281
531).
Would appreciate your comments and thoughts towards feasibility and
usability.
Suggested feature post as below:
Description
Processing queries involves many steps by the query processing engine.
Eventually the row containing the data queried for is found, i.e. either
involving few steps, if an index is used (generally preferred scenario) or
in case of searching the table for every row (like a heap) if no index is
used (worst case scenario).
If similar queries are repeated, sql server may use the cache to deliver
results faster. However we all know this "quickness" may not be reliable in
very large tables, limited resources or fairly big timegaps (due to cache
timeouts).
It would be great if we could use the "row locator" which is a combination
of fileid, page id and the row id of the row provided as a 'hint' within the
query (like we do for indexes) and get the desired row(s). This would
certainly be a great advantage specifically in searching for columns which
are PKs or have uniquely constrained indexes, or select top 1s.
I feel this feature could have a significant improvement in the time it
takes to get the result especially for a repeat query. Common scenarios
include queries for updates or deletes especially in disconnected scenarios
more common in web applications.
Proposed Solution
Two parts to this Solution:
Part 1: Have a new table linked function called table.GetRowLocatorKey() or
similar construct/syntax to get the row locator (fileid,pageid,rowid) key
for the row.
i.e.
select table.GetRowLocatorKey, <columns> from <table>
Part 2: For queries allow rowlocator hint(s) that could be provided as part
of the query where one or more row locators specified for a table could be
checked first by the query processor for results of the query like so:
Select <columns> from <table> where <conditions>
RowLocationHints:Table(rowlocation1,rowl
ocation2,rowlocation3)
The query processor would check the row locations first. If a single result
is to be found incase of a select top 1 or conditions involving a column
with unique index / PK then it could stop and return the result.
If the result is not found on the row location (incase of updates or
deleted) or if the situation doesnot involve select top 1s or unique index /
PK then the query processor should continue as it normally would.
regards
ManasvinOn Jun 6, 1:23 pm, "Manasvin" <piyush-at-manasvin-dot-com> wrote:
> It would be great if we could use the "row locator" which is a combination
> of fileid, page id and the row id of the row provided as a 'hint' within t
he
> query (like we do for indexes) and get the desired row(s).
Translation: Allow us to throw away the relational model and use a
network model database instead.
Ain't gonna happen.
It *might* have utility in a very few performance-related scenarios.
But 99% of the situations that use SQL Server would obtain absolutely
zero benefit.
If you want a network model database, use one. Leave SQL Server alone.|||> It would be great if we could use the "row locator" which is a combination
> of fileid, page id and the row id of the row provided as a 'hint' within
> the query (like we do for indexes) and get the desired row(s). This would
> certainly be a great advantage specifically in searching for columns which
> are PKs or have uniquely constrained indexes, or select top 1s.
Part of the beauty of a relational system is that the physical location of a
piece of data is abstracted from us. How are you going to know the fileid,
page id and row id of a particular row? How expensive is that part of the
lookup going to be? And even now that you have it, how do you know it will
be in the same physical location next week, tomorrow, or even in five
minutes?|||I think this feature is NOT to suggest moving away from a relational system
but a smarter one. A hint based pattern which is already being used on
various other aspects including choosing an index for instance withing a
query. The idea is that requery for the same record or row shouldnt take the
same effort regardless of cache hits or misses.
In a typical scenario which I believe happens often illustrated below:
A record a is queried for viewing.
Its at this stage the query process would anyways be able to accumalate the
rowlocation since its got there to collect the data row anyways.
the application which queried for this data and is now displaying the row(s)
could maintain the rowlocations keys (and which are not meant to be used as
permanent or static values). most times the application has disconnected and
the RDMBS is busy to serve other applications and their queries
Now if there is a requery or an update / delete to the above said row(s) the
rowlocation(s) could be supplied for these specific queries as a 'hint only'
but not to undermine the consistency or reliability of the query process,
rather a smarter way just get to the data quickly. in any case this hint(s)
may not be any good but could very well be enough to make a difference.
i believe the above sequence of application events or actions do happen
quite commonly.amongst many application if not most.
I hope the scenario above makes things a bit clearer. Am I the only one who
sees this as a very useful feature ?
regards
Piyush
"Aaron Bertrand [SQL Server MVP]" <ten.xoc@.dnartreb.noraa> wrote in mess
age
news:Oxvw2NGqHHA.3512@.TK2MSFTNGP06.phx.gbl...
> Part of the beauty of a relational system is that the physical location of
> a piece of data is abstracted from us. How are you going to know the
> fileid, page id and row id of a particular row? How expensive is that
> part of the lookup going to be? And even now that you have it, how do you
> know it will be in the same physical location next week, tomorrow, or even
> in five minutes?
"rpresser" <rpresser@.gmail.com> wrote in message
news:1181151962.535269.172900@.z28g2000prd.googlegroups.com...
> On Jun 6, 1:23 pm, "Manasvin" <piyush-at-manasvin-dot-com> wrote:
> Translation: Allow us to throw away the relational model and use a
> network model database instead.
> Ain't gonna happen.
> It *might* have utility in a very few performance-related scenarios.
> But 99% of the situations that use SQL Server would obtain absolutely
> zero benefit.
> If you want a network model database, use one. Leave SQL Server alone.
>|||On Jun 6, 12:23 pm, "Manasvin" <piyush-at-manasvin-dot-com> wrote:
> Hi.
> I've posted this to the SQL wish center (Connect/sqlserver/feedback) as a
> suggested new feature.
> (https://connect.microsoft.com/SQLSe...back.aspx?Fe...)
.
> Would appreciate your comments and thoughts towards feasibility and
> usability.
> Suggested feature post as below:
> Description
> Processing queries involves many steps by the query processing engine.
> Eventually the row containing the data queried for is found, i.e. either
> involving few steps, if an index is used (generally preferred scenario) or
> in case of searching the table for every row (like a heap) if no index is
> used (worst case scenario).
> If similar queries are repeated, sql server may use the cache to deliver
> results faster. However we all know this "quickness" may not be reliable i
n
> very large tables, limited resources or fairly big timegaps (due to cache
> timeouts).
> It would be great if we could use the "row locator" which is a combination
> of fileid, page id and the row id of the row provided as a 'hint' within t
he
> query (like we do for indexes) and get the desired row(s). This would
> certainly be a great advantage specifically in searching for columns which
> are PKs or have uniquely constrained indexes, or select top 1s.
> I feel this feature could have a significant improvement in the time it
> takes to get the result especially for a repeat query. Common scenarios
> include queries for updates or deletes especially in disconnected scenario
s
> more common in web applications.
> Proposed Solution
> Two parts to this Solution:
> Part 1: Have a new table linked function called table.GetRowLocatorKey() o
r
> similar construct/syntax to get the row locator (fileid,pageid,rowid) key
> for the row.
> i.e.
> select table.GetRowLocatorKey, <columns> from <table>
> Part 2: For queries allow rowlocator hint(s) that could be provided as par
t
> of the query where one or more row locators specified for a table could be
> checked first by the query processor for results of the query like so:
> Select <columns> from <table> where <conditions>
> RowLocationHints:Table(rowlocation1,rowl
ocation2,rowlocation3)
> The query processor would check the row locations first. If a single resul
t
> is to be found incase of a select top 1 or conditions involving a column
> with unique index / PK then it could stop and return the result.
> If the result is not found on the row location (incase of updates or
> deleted) or if the situation doesnot involve select top 1s or unique index
/
> PK then the query processor should continue as it normally would.
> regards
> Manasvin
It's been suggested and even implemented before (Oracle). Search for
"Invalid ROWID" and "ROWID problem" in Oracle newsgroups and start
learning why this idea is not as smart as you think.|||You are absolutely correct Piyush, these actions as you describe them happen
ALL the time! They are called using a PRIMARY/UNIQUE KEY value that is
INDEXED as the lookup for the UPDATE/DELETE for the row originally accessed.
And since the data page associated with this particular row (and it's PK/UK
index) will probably still be in RAM (due to SQL Server's incredibly
effective caching algorithms) this subsequent lookup for the DML statement
will likely occur within a few milliseconds tops.
TheSQLGuru
President
Indicium Resources, Inc.
"Manasvin" <piyush-at-manasvin-dot-com> wrote in message
news:uGW5pqHqHHA.1240@.TK2MSFTNGP04.phx.gbl...
>I think this feature is NOT to suggest moving away from a relational system
>but a smarter one. A hint based pattern which is already being used on
>various other aspects including choosing an index for instance withing a
>query. The idea is that requery for the same record or row shouldnt take
>the same effort regardless of cache hits or misses.
> In a typical scenario which I believe happens often illustrated below:
> A record a is queried for viewing.
> Its at this stage the query process would anyways be able to accumalate
> the rowlocation since its got there to collect the data row anyways.
> the application which queried for this data and is now displaying the
> row(s) could maintain the rowlocations keys (and which are not meant to be
> used as permanent or static values). most times the application has
> disconnected and the RDMBS is busy to serve other applications and their
> queries
> Now if there is a requery or an update / delete to the above said row(s)
> the rowlocation(s) could be supplied for these specific queries as a 'hint
> only' but not to undermine the consistency or reliability of the query
> process, rather a smarter way just get to the data quickly. in any case
> this hint(s) may not be any good but could very well be enough to make a
> difference.
> i believe the above sequence of application events or actions do happen
> quite commonly.amongst many application if not most.
> I hope the scenario above makes things a bit clearer. Am I the only one
> who sees this as a very useful feature ?
> regards
> Piyush
> "Aaron Bertrand [SQL Server MVP]" <ten.xoc@.dnartreb.noraa> wrote in
> message news:Oxvw2NGqHHA.3512@.TK2MSFTNGP06.phx.gbl...
>
> "rpresser" <rpresser@.gmail.com> wrote in message
> news:1181151962.535269.172900@.z28g2000prd.googlegroups.com...
>|||having read a lot bit about like you said in google, I believe there is a
big difference in the way implementation of this feature is being suggested.
A less hard and more flexible approach can easily be worked out. also kindly
my secondary post attached to the thread
"Alex Kuznetsov" <AK_TIREDOFSPAM@.hotmail.COM> wrote in message
news:1181163417.397905.229550@.g37g2000prf.googlegroups.com...
> On Jun 6, 12:23 pm, "Manasvin" <piyush-at-manasvin-dot-com> wrote:
> It's been suggested and even implemented before (Oracle). Search for
> "Invalid ROWID" and "ROWID problem" in Oracle newsgroups and start
> learning why this idea is not as smart as you think.
>|||yes and all I am saying then is that cache may not be dependable in larger
time gaps or very big table sets. infact it could be faster than using the
cache and in a high load scenario could make a significant difference for
better.
"TheSQLGuru" <kgboles@.earthlink.net> wrote in message
news:OyZy88HqHHA.196@.TK2MSFTNGP05.phx.gbl...
> You are absolutely correct Piyush, these actions as you describe them
> happen ALL the time! They are called using a PRIMARY/UNIQUE KEY value
> that is INDEXED as the lookup for the UPDATE/DELETE for the row originally
> accessed. And since the data page associated with this particular row (and
> it's PK/UK index) will probably still be in RAM (due to SQL Server's
> incredibly effective caching algorithms) this subsequent lookup for the
> DML statement will likely occur within a few milliseconds tops.
> --
> TheSQLGuru
> President
> Indicium Resources, Inc.
> "Manasvin" <piyush-at-manasvin-dot-com> wrote in message
> news:uGW5pqHqHHA.1240@.TK2MSFTNGP04.phx.gbl...
>|||On 6 Jun, 22:19, "Manasvin" <piyush-at-manasvin-dot-com> wrote:
> yes and all I am saying then is that cache may not be dependable in larger
> time gaps or very big table sets. infact it could be faster than using the
> cache and in a high load scenario could make a significant difference for
> better.
>
The whole suggestion is wrong-headed. Performance is determined solely
by the physical implementation, to which the presence or absence of a
"row locator" adds little or nothing. The advantages of exposing a
physical row locator are tiny when compared to other engine-level
enhancements that could be made but the disadvantages are enormous.
If you want real improvements then let's suggest better support for
Data Independence in the engine.
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--|||On 6 Jun, 21:38, "Manasvin" <piyush-at-manasvin-dot-com> wrote:
> I think this feature is NOT to suggest moving away from a relational syste
m
> but a smarter one.
In that case I suggest you don't know what a relational system is.

> Now if there is a requery or an update / delete to the above said row(s) t
he
> rowlocation(s) could be supplied for these specific queries as a 'hint onl
y'
> but not to undermine the consistency or reliability of the query process,
> rather a smarter way just get to the data quickly. in any case this hint(s
)
> may not be any good but could very well be enough to make a difference.
> i believe the above sequence of application events or actions do happen
> quite commonly.amongst many application if not most.
>
This sounds like a server keyset-based cursor. There is absolutely no
need to return a row locator to the client in order to achieve that.
Let the DBMS handle it. You could I suppose have a hint that pinned
the set of rows in cache, but on the whole SQL Server is pretty good
at cache anyway.
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--

Friday, February 24, 2012

Get Quick results using "Row Locator"s (fileid+pageid+rowid)

No comments:

Post a Comment

Failure Details

Blog Archive

About Me