The problem of removing the data from the matrix
1 回表示 (過去 30 日間)
古いコメントを表示
I have an array including about 100 numbers.We assume the number is from 1 to 100.I want to get 7 numbers from the array, then calculate the sum of these numbers. If the sum value is in the range of two known number.We get these seven number and dispaly.oterwise we go on to find the sum value between the known number until all the arrangement method have been tried. I want to get the matlab programming code to complete it,i really don't have a good method. Look forward to you help.Thank you!
4 件のコメント
Walter Roberson
2011 年 11 月 17 日
This is a technical forum for civil discussions. Personal arguments or justifications will be removed.
Jan
2011 年 11 月 17 日
@Walter: To be exact, this is a technical forum for questions related to Matlab.
@chengxuan yu: I do not answer questions sent by email. I do answer here in this forum.
Of course you are allowed to say what you want. And the editors have the power to remove it according to these guidelines:
1. Personal attacs
2. Off-topic content not related to MATLAB or any MathWorks products
Walter keeps the forum clean. He is assiduous and careful. I like it.
回答 (6 件)
Andrei Bobrov
2011 年 11 月 7 日
variant
d = randperm(100).';
idx = bsxfun(@plus,1:7,(0:numel(d)-7)');
d1 = d(idx);
ou1 = sum(d1,2);
bnd = [250 300] % Interval borders
[c c] = histc(ou1,bnd);
out = d1(find(c,1,'first'),:)
simple variant
d = randperm(100).'; % your data
n = 7;
bnd = [250 300];
n1 = 0:n-1;
for j1 = 1:numel(d)-n+1
i1 = j1 + n1;
s1 = sum(d(i1));
if s1 >= bnd(1) && s1 <= bnd(2)
out = d(i1);
break
end
end
2 件のコメント
Fangjun Jiang
2011 年 11 月 8 日
The following code could solve your problem if you have enough memory. N is set at 10 first so you can try and understand the code. nchoosek(10,7) is 120 combinations. Among them, anywhere from 20 to 80 combinations might meet the criteria depending on the random data. nchoosek(100,7) will be 1.6008e+010 combinations. I don't think a personal computer at this age will have 16*8*8G memory so maybe this solution is not practical. In the meantime, take a look to see if this is a solution (for a small scale). I'll think about if there is a way to not to use the huge array, maybe use a long for-loop. Someone else might chip in too.
N=10;
K=7;
V_min=3;
V_max=4;
Data=rand(N,1);
Index=nchoosek(1:N,K);
M=size(Index,1);
Flag=false(M,1);
for j=1:M
Select=Data(Index(j,:));
Total=sum(Select);
Flag(j)=and(V_min<Total,Total<V_max);
end
SelectedData=Data(Index(Flag,:))
size(SelectedData,1)
6 件のコメント
Fangjun Jiang
2011 年 11 月 8 日
@chengxuan, you didn't fully understand the code. N is the number of data points you have. Data=rand(N,1) is the data points in my code, which should correspond to your 127x1 voltage data.
Fangjun Jiang
2011 年 11 月 8 日
@Simon: Jan, I found this problem interesting. I doubt it is a homework as a professor might give a nchoosek(10,7) but not a nchoosek(100,7) problem.
Jan
2011 年 11 月 8 日
There are no solutions with more than 16 batteries, because the mean voltage rate is above the upper limit 7.85. One of the many solutions:
V = [ ...
10.866, 11.0957, 10.9235, 10.9357, 11.121, 11.0172, 11.204, 10.7411, 11.233, ...
10.7411, 11.304, 11.43, 11.3116, 11.4252, 11.2561, 11.2136, 11.2529, 11.288, ...
11.353, 11.2135, 11.274, 11.3288, 13.0326, 13.0389, 13.021, 13.0716, 13.0399, ...
13.0177, 13.0297, 13.0366, 13.0173, 13.0006, 13.0061, 13.7852, 13.8199, ...
13.8602, 13.8133, 13.7394, 13.6925, 13.6472, 13.7704, 11.3288, 12.1985, ...
12.1771, 12.2468, 12.1428, 12.2267, 10.9119, 11.018, 9.8768, 9.9391, 10.9587, ...
10.8442, 11.002, 10.8963, 10.7968, 10.9584, 10.9174, 11.0768, 10.3929, 10.6948, ...
10.8723, 11.0513, 11.1445, 11.1231, 11.0267, 11.0055, 11.1285, 11.0819, ...
11.0273, 11.0783, 11.0346, 11.1789, 11.0578, 11.0693, 10.8994, 11.0868, ...
11.2318, 10.9432, 9.6885, 8.9472, 11.6253, 11.6791, 11.6072, 11.666, 11.746, ...
11.6042, 9.5838, 9.6636, 8.9982, 11.6352, 11.6809, 11.6358, 11.6418, 11.5941, ...
11.6731, 10.3308, 10.4433, 9.7064, 9.0447, 10.6631, 10.6594, 10.5641, 10.5523, ...
10.543, 10.6785, 10.4593, 10.5995, 10.6576, 10.7239, 10.6486, 11.1344, 11.1412, ...
11.0046, 10.7201, 11.1397, 10.7368, 11.0692, 11.1128, 10.4428, 10.9322, ...
11.0642, 11.0944, 11.1896, 11.329, 11.1137, 11.0596];
Index = [ ...
81, 71, 42, 96, 11, 45, 41, 75, 56, 118, 76, 43, 98, 93, 101, 114; ...
25, 90, 62, 14, 79, 107, 20, 103, 84, 33, 91, 126, 87, 12, 13, 27; ...
70, 80, 5, 63, 95, 69, 97, 59, 17, 94, 58, 73, 74, 106, 55, 50; ...
19, 47, 85, 120, 109, 77, 110, 2, 104, 3, 48, 57, 86, 8, 78, 102; ...
99, 116, 7, 10, 83, 122, 52, 6, 88, 49, 113, 61, 123, 54, 64, 1; ...
115, 31, 105, 112, 121, 66, 65, 51, 30, 72, 82, 60, 119, 127, 15, 68; ...
32, 92, 67, 22, 111, 53, 89, 26, 4, 100, 108, 124, 117, 18, 125, 9];
disp(sum(V(Index), 1));
Using the gunshot technique finding the solutions is even faster, if the limits 77.7 < sum(V) & sumV < 77.85 are used. Then my program needs about 0.1 to 2 seconds, average 1.1 seconds. In addition increasing the lower limit let the batteries remain longer in the wanted voltage range.
The unused batteries have a too high voltage and you can even create a pack of 6 to get the required limits. Is this allowed?
Sorry for my idea that this is a homework. The values look extremly artificial and it is a very bad idea to connect batteries with 9.1V and 13.8V in series. What will happen if the weakest battery is exhausted?! So please be careful if you apply the solution and replace weak batteries as soon as possible.
As far as I understand, one solution is enough for you. But here is the program, if anybody else is interested:
% V = [... values from above]
Idx = uint8(1):uint8(length(V));
R = zeros(7, 18, 'uint8');
iR = 0;
for i = 1:1e5 % No infinite loops
test = Shuffle(length(Idx), 'index', 7); % or "randperm(length(Idx), 7)"
sumV = sum(V(Idx(test)));
if and(77.7 < sumV, sumV < 77.85) % Actually "77.05 < sumV"
iR = iR + 1;
R(:, iR) = Idx(test);
Idx(test) = [];
% Stop early, if you have proven, that there is no 17th solution:
% if iR == 16, break; end
end
end
0 件のコメント
Jan
2011 年 11 月 8 日
@Fangjun: UNIQUE is slow. A tiny change let the program rum 11 times faster (70 sec -> 6 sec):
N = 8
K = 7;
V_min = 3.5;
V_max = 3.8;
Data = rand(N, 1);
% fid=fopen('output.csv', 'w');
Fmt = [repmat('%d,',1,K-1),'%d\n'];
for j1=1:N
for j2=1:N
for j3=1:N
for j4=1:N
for j5=1:N
for j6=1:N
for j7=1:N
% CHANGED: UNIQUE->SORT
Index = sort([j1 j2 j3 j4 j5 j6 j7]);
if all(diff(Index)) == 0
continue;
end
Total = sum(Data(Index));
if V_min < Total && Total < V_max % AND->&&
fprintf(Fmt, Index);
end
end
end
end
end
end
end
end
% fclose(fid)
But it would be fast to cut the branches early and exclude non-unique indices directly after choosing them:
for j1=1:N
for j2=1:N
if j1~=j2
for j3=1:N
if j3~=j1 && j3~=j2
for j4=1:N
if j4 ~= j1 && j4 ~= j2 && j4 ~= j3
for j5=1:N
if j5 ~= j1 && j5 ~= j2 && j5 ~= j3 && j5 ~= j4
for j6=1:N
if j6 ~= j1 && j6 ~= j2 && j6 ~= j3 && ...
j6 ~= j4 && j6 ~= j5
for j7=1:N
if j7 ~= j1 && j7 ~= j2 && j7 ~= j3 && ...
j7 ~= j4 && j7 ~= j5 && j7 ~= j6
Index = [j1 j2 j3 j4 j5 j6 j7];
Total=sum(Data(Index));
if V_min<Total && Total<V_max
% fprintf(Fmt, Index);
end
end
end
end
end
end
end
end
end
end
end
end
end
end
0.24 sec. About 300 times faster - then the full program runs for 6 month only.
[EDITED: 09Nov2011 11:07 UTC] The above solutions create duplicate answers - 5040 for N=8. This can be avoided if the indices are created in a sorted order already:
for j1=1:N
for j2=j1+1:N
for j3=j2+1:N
for j4=j3+1:N
for j5=j4+1:N
for j6=j5+1:N
for j7=j6+1:N
Index = [j1 j2 j3 j4 j5 j6 j7];
Total = sum(Data(Index));
if V_min<Total && Total<V_max
fprintf(Fmt, Index);
end
end
end
end
end
end
end
end
Runtime for N=8: 0.000865 seconds
But currently the program creates a list of all packages with the desired property. But the author does not want a single package, but as much packages as possible. Therefore the main part remains unsolved by this method: Find a maximum set of non-overlapping packages.
8 件のコメント
Jan
2011 年 11 月 17 日
@Fangjun: I've posted my gunshot code above on the 8th of Nov. "Gunshot" means, that the program is simple, free of intelligence, and even processes the inputs randomly - but it hits the mark at least near to the target. The best matching Wikipedia article is http://en.wikipedia.org/wiki/Anti-pattern . Sorry, I do not post the specific anchor - this technique is called "gunshot linking". ;-)
The 16 is a "magic number" (see anti-patterns): There ist simply no 17th soltuion, therefore the search can stop after finding 16 battery packs. You can insert a smarter break condition: If the sum of the 7 batteries with the smallest capacity is higher than the lower limit, stop the process.
There is a link to the Shuffle function on the bottom of my other answer above, so I do not think, that you have to _guess_.
Fangjun Jiang
2011 年 11 月 8 日
So this is a nchoosek() combination problem.
The theoretical solution:
Use index=nchoosek(1:100,7) to generate all the possible combinations, use index to select the batteries, sum the voltage and check against the range. The problem is that there is not enough memory to hold the index, which is a very large matrix, not even mention nchoosek(1:127,7).
The practical solution:
You don't really need all the possible solution. You just need the sufficiently enough solutions. You can generate a random index such as index=1+round(99*rand(7,1)). Use this index to select the battery and then check the sum. This is similar to Jan's solution. You can achieve the same even without shuffle() or the new randperm(N,M) function. Repeat the random selection large enough in a for-loop should give you a good result. Remember to check every time whether the index has duplicated numbers.
I think seeking the theoretical solution using MATLAB programming is still possible.
Use the following code, go through 7 levels of nested for-loop to get all the possible combinations. If the voltage sum is within range, write the battery index to a text file. Duplicated index within each combination has been taken care of. But for every combination that meets the criteria, it will be duplicated 5040 times (which is 7! or prod(1:7)). That is also the number of repeats that are executed in the code inside the for-loops. So the code is inefficient but it runs without hitting memory limits. For N=8, it takes about 100 seconds on my laptop. For N=100, it will take 100*(100^7/8^7) seconds which is 151 years! Darn! I thought it would be practical. But at least, it is not light-years. Oh, wait, light-year is not about time.
One more thing, the output .csv file contains 5040 duplicated copies for every valid selection, but that could be easily handled using Excel.
N=8;
K=7;
V_min=3.5;
V_max=3.8;
Data=rand(N,1);
fid=fopen('output.csv','wt');
for j1=1:N
for j2=1:N
for j3=1:N
for j4=1:N
for j5=1:N
for j6=1:N
for j7=1:N
Index=unique([j1 j2 j3 j4 j5 j6 j7]);
if size(Index,2)<K
continue;
end
Total=sum(Data(Index));
if and(V_min<Total,Total<V_max)
fprintf(fid,[repmat('%d,',1,K-1),'%d\n'],Index);
end
end
end
end
end
end
end
end
fclose(fid);
3 件のコメント
Jan
2011 年 11 月 9 日
Capacity is less dangerous, because I assume (hope) that the batteries are connected in parallel.
参考
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!