| Home Page | Web Logs at CCIL |
|
Dec_24_2008 Alternate formulation is something like... Need to verify "Average" and simplify (if possible). Let r = radix n = number of bits And i - log_2 (r) j = the number of bytes needed to make a character Then Best = ceil(i)/jn Average = (ceil(i) + 2 - 2^(1 + ceil(i))/r))/jn Worst = floor(i)/jn Link for this entry: http://home.ccil.org/~remlaps/weblog_2008/programming_index.html#ThesisFormula24Dec08 |
|
Dec_22_2008 All representations of all numbers stored in a 3D matrix. X axis = base Y axis = numeral Z axis = digits (index 0 = ones place) scp44@tux64-07:~/programming/cpp$ cat dnum.cpp #include "NumSpace.h" #include Link for this entry: http://home.ccil.org/~remlaps/weblog_2008/programming_index.html#NumberSpace22Dec08 |
|
Dec_13_2008 Read later. From December's Journal of the ACM... "Linear-time disk-based implicit graph search" - Richard E. Korf http://doi.acm.org/10.1145/1455248.1455250 Link for this entry: http://home.ccil.org/~remlaps/weblog_2008/programming_index.html#DiskGraphSearch12Dec08 |
|
Oct_09_2008 In computer graphics last week, we learned about "scaling matrices". With 2d, heterogeneous coordinates, they look like this... sx 0 0 0 sy 0 0 0 1 With 3d, heterogeneous coordinates, they look like this... sx 0 0 0 0 sy 0 0 0 0 sz 0 0 0 0 1 I assume there are higher dimensional scaling matrices following the same pattern too. Of course, that reminded me of the base conversion matrices I played with a while back. Apparently, my base conversion matrix when the source_base is a multiple of the target_base is a scaling matrix. See here...
>> A=[4 0 0;0 2 0;0 0 1]
A =
4 0 0
0 2 0
0 0 1
>> P=[4 3 2]
P =
4 3 2
>> P * A
ans =
16 6 2
>> 16 * 25 + 6 * 5 + 2
ans =
432
Sx = 4 and Sy = 2. To complete the conversion, we would "normalize" the 16 and the 6 by carrying, carrying 25 from the 2nd place to the first, 16,6,2 becomes 17,1,2 then carrying three 125's from the first place to create a new first place gives 3,2,1,2 which is
>> 3 * 125 + 2 * 25 + 1 * 5 + 2
ans =
432
None of that is new, except I didn't know that there was such a thing as a "scaling" matrix before. Now I'm wondering if there are any special geometric properties of these base conversion matrices with increasing powers on the diagonal. I'm also wondering about the similar matrix, when the target is a multiple of the source. i.e.
>> R=[1 0 0;0 2 0;0 0 4]
R =
1 0 0
0 2 0
0 0 4
Since the 4 would line up with a "1" in heterogeneous coordinates, it's hard to imagine what that does, graphically (and maybe that explains why I had to divide the 4 back out, when doing base conversions). That would be somehow similar too...
R' =
1/4 0 0
0 2/4 0
0 0 4/4
Which would be a scaling matrix. Shrinking it, instead of growing it. Link for this entry: http://home.ccil.org/~remlaps/weblog_2008/programming_index.html#ScalingMatrices09Oct08 |
|
Jul_11_2008 Didn't take long to hit my first mismatch between octave and matlab...
printf "Experiment with Random Matrices and Singular Values\n"
function [x] = r(MU, SIGMA, N)
[x] = svd(normrnd(MU,SIGMA,N,N));
endfunction;
A=r(1,1,16);
B=r(1,2,16);
Works in octave, but matlab says... "Function definitions are not permitted at the prompt or in scripts." According to this guy http://www.math.tamu.edu/~comech/tools/octave-basics/ matlab functions must apparently be defined in their own file. What a pain. Link for this entry: http://home.ccil.org/~remlaps/weblog_2008/programming_index.html#MatlabNoFunctions11July07 |
|
Jul_11_2008 Here's an online reference for GNU Octave. Hopefully octave is close enough to matlab for doing my homework assignments, because remote matlab from school is painfully slow. http://www.gnu.org/software/octave/doc/interpreter/ Link for this entry: http://home.ccil.org/~remlaps/weblog_2008/programming_index.html#Octave11July08 |
|
Feb_24_2008 A quick look at corporate banding or academic grading on a curve.... As mentioned before, I was wondering how accurate a commonly used algorithm for ranking people is, so I wrote a little program (computer model ; -) to rank randomly ordered numbers. If we take a population (like 240,000 employees) and split it into sub-groups (like departments of 20 people) then categorize the sub-groups, how does that compare to the categorization over the entire population? Fundamentally, banding (or grading on a curve) is just an attempt at a bucket-sort algorithm. We do have bucket-sort algorithms that work. (It's not brain surgery... anyone who has ever played cards can do it.) Does this one? From program output pasted at the bottom, it looks like (assuming random distribution of merit) applying this algorithm in a hypothetical 240,000 person company split into departments of 20, would misplace somewhere around 2.5% of the employees (12,000) in each of the top-10 and bottom-10 percentage bands. Correspondingly, 1/4 of the people ranked as top-10 or bottom-10 would not belong there if we compared their merit with the entire population. This is assuming every personnel decision along the way is executed perfectly. It looks as if it gets worse as the size of the category or size of the subgroup gets smaller. I did the program logic in my head, while distracted by a 6 year old wielding a water-gun, so errors are likely, but I think maybe it's worth digging into further if I ever find time. C++ source code (as of today) is here: http://home.ccil.org/~remlaps/src/curve.cpp.txt Here are 3 runs with {subgroups=20, category=10%} followed by one with {subgroup=20, category=5}, and one with {subgroup=100, category=10}. C:\Personal\PROGRA~1\cpp>curve Population size?: 240000 Subgroup size?: 20 What percent in high num group?: 10 What percent in low num group?: 10 Ranked high but not: 6114 2.5475% of population 25.475% of high rankings. Ranked low but not: 6380 2.65833% of population 26.5833% of low rankings. C:\Personal\PROGRA~1\cpp>curve Population size?: 240000 Subgroup size?: 20 What percent in high num group?: 10 What percent in low num group?: 10 Ranked high but not: 6215 2.58958% of population 25.8958% of high rankings. Ranked low but not: 6326 2.63583% of population 26.3583% of low rankings. C:\Personal\PROGRA~1\cpp>curve Population size?: 240000 Subgroup size?: 20 What percent in high num group?: 10 What percent in low num group?: 10 Ranked high but not: 6171 2.57125% of population 25.7125% of high rankings. Ranked low but not: 6363 2.65125% of population 26.5125% of low rankings. C:\Personal\PROGRA~1\cpp>curve Population size?: 240000 Subgroup size?: 20 What percent in high num group?: 5 What percent in low num group?: 5 Ranked high but not: 4313 1.79708% of population 35.9417% of high rankings. Ranked low but not: 4411 1.83792% of population 36.7583% of low rankings. C:\Personal\PROGRA~1\cpp>curve Population size?: 240000 Subgroup size?: 100 What percent in high num group?: 10 What percent in low num group?: 10 Ranked high but not: 2930 1.22083% of population 12.2083% of high rankings. Ranked low but not: 3195 1.33125% of population 13.3125% of low rankings Link for this entry: http://home.ccil.org/~remlaps/weblog_2008/programming_index.html#24Feb08 |
|
Feb_14_2008 My code won't handle numbers bigger than 152**4 as written... If we do composite character codes making use of 152 extended ASCII characters, we get to 533,794,816 character possibilities and 90% bit utilization. - rangeRSE(pow(152,4),pow(2,32),32,32); val it = (0.875,0.906069915823,0.90625) : real * real * real - pow(152,4); val it = 533794816 : int - Link for this entry: http://home.ccil.org/~remlaps/weblog_2008/programming_index.html#Composite152-14Feb08 |
|
Feb_14_2008 - 93 * 93 * 93 * 93; val it = 74805201 : int - pow(2,32); val it = 4294967296 : int - rangeRSE(74805201,4294967296,32,32); val it = (0.8125,0.818930315754,0.84375) : real * real * real It seems that we could get > 80% bit utilization and make 74,805,201 characters available by aggregating 4 ASCII (not extended ASCII) characters into a composite character set. I.E. four ASCII characters make a character in the composite character set. "0000" = 0 "0001" = 1 ... "7342" = alpha "7344" = beta ... I wonder how such a scheme would compare to unicode in performance, portability and ease of implementation? Right now, my machine doesn't want to answer the question for 5 consectuive ASCII characters. : -( Link for this entry: http://home.ccil.org/~remlaps/weblog_2008/programming_index.html#CompositeCode14Feb08 |