Wednesday, March 7, 2012

Get some values from a group

Hello,

here's my problem. There's table t with:

year, id, price
2000 1 100
2000 1 200
2000 2 100
2000 3 500
2000 4 100
2001 1 100
2001 2 300
...

I need a way, to get those prices, so I have e.g. 3 Groups for every year with the same number of elements in it. Here's the bakground:
My employer wants a report for the sold objects in each year classified by upper, middle and lower price level. The biggest problem is, that I cannot define prices manually. What I have to do, is to order prices in each group (e.g. year), combine that with row_number(), devide row_count by 3, do a loop with step (row_number / 3) and get price at that position.
But getting those prices gives me the willies. I just discoverd the "MODEL" Clause. Do you think, that can help me solve the problem?
Maybe you have some tips.
Thanks!sorry, anything with "do a loop" in the requirements is not going to get a lot of responses on this site

we normally suggest that you restate your requirements in terms of what data you want, rather than how to get it (especially if how to get it involves cursors)|||nevermind, I think ntile is what i'm looking for ;-)|||The statistically correct way to do what you've described using sets is:CREATE TABLE #p (
yearID INT
, ID INT
, price MONEY
)

INSERT INTO #p (
yearID, ID, price
) SELECT 2000, 1, 100
UNION ALL SELECT 2000, 1, 200
UNION ALL SELECT 2000, 2, 100
UNION ALL SELECT 2000, 3, 500
UNION ALL SELECT 2000, 4, 100
UNION ALL SELECT 2001, 1, 100
UNION ALL SELECT 2001, 2, 300

SELECT a.yearID, a.ID, a.price
, 1 + Convert( INT, (SELECT Count(*)
FROM #p AS b
WHERE b.price < a.price)
/ (SELECT Count(*) / 3.0 FROM #p AS c))
FROM #p AS a

DROP TABLE #pNote that the conversion back to INT (forcing truncation) is very important from a statistical perspective.

-PatP|||Thank you Pat! I could make it with ntile

select year,pos,min(price) from
(
select
ntile(4) over(partition by year order by price) as pos,
year,
price
from
(select distinct id,year,price from #t ) as a
)as b
group by pos,year

This works fine, because I dont need to know anything about the data before. Anyway, this query doesn't look optimal. Is there a way to reduce selects?
Thx!

No comments:

Post a Comment