Manipulating a text file without using the symbolic engine

Hello friends!
I have extremely long algebraic expressions saved as text files. I need to assign 0 to some variables in the text file and
find a way to simplify the result. Of course I could use the symbolic engine, but I do not dare to, simply because it takes ages.
So I would like to do this special simplification by text manipulation only. To make it clear, here is a simple example:
str='-(dt^(1/2)*(24*D0+dt*(6*D2+dt*(D4+4*D0*D1^2+4*D0^2*D2+4*D0*D3+6*D1*D2)+12*D0*D1)))/24';
Now, if D2=0 then str should be simplified to
str='-(dt^(1/2)*(24*D0+dt*(dt*(D4+4*D0*D1^2+4*D0*D3)+12*D0*D1)))/24';
Any ideas?
Your help is greatly appreciated!
Babak

11 Comments

Rik on 23 Dec 2021
These manipulations will get very complex very soon, and with complexity tends to come a performance hit.
You might be able to split your equation into separate terms by splitting on parentheses, plus signs, and minus signs. A term can be removed if you see 'D2*' or '*D2' in it. However, that removal can only happen at the lowest level of nesting.
This would be a Herculean task, and it isn't clear at the outset whether the performance will actually be acceptable.
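As a rough illustration of the first step described above, here is a minimal sketch that splits an expression into its top-level terms by tracking parenthesis depth. The example expression and all variable names are made up for illustration, and the sketch assumes every '+' or '-' outside parentheses is a binary operator (so inputs such as '2*-3' or '1e-5' would be split incorrectly).

```matlab
% Hypothetical example expression (not from the original files)
expr = 'D0*(D1+D2)+D0*D1-6*D2';

% Nesting depth at each character: number of '(' seen minus number of ')' seen
depth = cumsum(expr == '(') - cumsum(expr == ')');

% A '+' or '-' at depth 0 starts a new top-level term
isSplit = (expr == '+' | expr == '-') & depth == 0;
isSplit(1) = false;                 % a leading sign belongs to the first term

starts = [1, find(isSplit)];
stops  = [find(isSplit) - 1, numel(expr)];
terms  = arrayfun(@(a, b) expr(a:b), starts, stops, 'UniformOutput', false);

disp(terms)   % {'D0*(D1+D2)'}  {'+D0*D1'}  {'-6*D2'}
```

Deciding which terms can be dropped would still require repeating the same split inside each parenthesized group, which is where the complexity mentioned above comes from.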
Mohammad Shojaei Arani on 24 Dec 2021
Thanks Rik,
I wish I had never learned MATLAB. It is not free and has lots of problems. It is really crazy. Python is free and much better in many ways. I started a project 3 months ago and unfortunately used MATLAB. Once I finish it, I will break up with MATLAB forever.
Rik on 24 Dec 2021
Python will not help you in this case, unless there is an equivalent package that does exactly what you need with better performance than MATLAB.
There are many pros and cons for MATLAB, and the list for Python is long as well. The problem in this thread would not be substantially easier with Python. I suspect many of the manipulations and splits will have to be done with regular expressions, which have equivalent performance as far as I'm aware.
Which program you use is obviously up to you, but I don't think this issue is a fair criticism. An "extremely long algebraic expression" can be expected to take a lot of time.
Walter Roberson on 24 Dec 2021
What are your requirements? If D2 = 1 what results would you expect from each of these?
D2*5 + 2
D0*D2 + D0*3
4*D0/D2
4*D0/(D2+1)
D0*D1 + D0*D1*D2
3*D2 + D0*D1 + 5
Mohammad Shojaei Arani on 24 Dec 2021
Hi Walter,
In my case, D2=0. If D2=1 or any other non-zero number, then the problem is easy. The challenge is the case D2=0, since in that case I need to do a lot of string manipulation. For instance, in your 5th expression, D0*D1 + D0*D1*D2, if D2=1 then I can use the command 'replace' to replace '*D2' with '' to get D0*D1 + D0*D1. But in the case D2=0 the problem is not that simple and a lot of text manipulation is needed. Consider the expression
1+D0*D1*D2(D0*D1 + D0*D1*D2^3-3D5^7)-D6*D3+5;
Again, it is very easy to tackle the case D2=1, but it is not that easy to tackle the case D2=0.
Mohammad Shojaei Arani on 24 Dec 2021
Hi Rik,
Well, perhaps it is not correct to compare MATLAB and Python in the way you mentioned. But MATLAB has lots of strange problems. For instance, a particularly annoying problem, which I really suffered from, is that it does not have a garbage collector. As a result, when I use the command 'clear' I do not get the memory back completely, and this leads to a loss of memory. The only way to fix this problem is to shut MATLAB down and start it again. Two weeks ago I had to run a very time-consuming for-loop and decided to do it at night while I was sleeping. The next morning, I noticed that there was no memory left and my for-loop had actually stopped working after the 5th iteration. So I had to run my for-loop for a few iterations, then shut MATLAB down and continue again.
Isn't this a big shame for a computer language which is also expensive to buy?
Walter Roberson on 24 Dec 2021
There is a garbage collector. It does not always work perfectly, but absolutely definitely there is a garbage collector.
Walter Roberson on 24 Dec 2021
You did not answer my questions about what the expected results would be in the cases I showed.
If you are only willing to answer questions about D2=0 then what is the expected result of
D0*(D1 + D2) + D0*D1
D0/D2
3*(D2 + 1) + 4
Mohammad Shojaei Arani on 25 Dec 2021
Hmm! I do not understand the point of your question. But the answer seems straightforward to me, as below:
D0*(D1 + D2) + D0*D1 = D0*(D1+0)
D0/D2 = D0/0
3*(D2 + 1) + 4 = 3*(0+1)
Of course, the symbolic engine can make better simplifications, but I would still be happy with the above less desirable simplifications, too. Why? Because my actual algebraic expressions, saved as text files, are so large that I am not able to use the symbolic engine on them.
DGM on 25 Dec 2021
Edited: DGM on 25 Dec 2021
That's not simplification. That's merely substitution, and it's not what your initial example describes.
If substitution is sufficient, that can be done with strrep():
tosub = 'D2';
orig = '-(dt^(1/2)*(24*D0+dt*(6*D2+dt*(D4+4*D0*D1^2+4*D0^2*D2+4*D0*D3+6*D1*D2)+12*D0*D1)))/24';
example = '-(dt^(1/2)*(24*D0+dt*(dt*(D4+4*D0*D1^2+4*D0*D3)+12*D0*D1)))/24';
subsonly = strrep(orig,tosub,'0')
subsonly = '-(dt^(1/2)*(24*D0+dt*(6*0+dt*(D4+4*D0*D1^2+4*D0^2*0+4*D0*D3+6*D1*0)+12*D0*D1)))/24'
but it's not clear that such a minor change actually saves any significant amount of time, as no simplification occurs.
syms dt D0 D1 D2 D3 D4
orig = '-(dt^(1/2)*(24*D0+dt*(6*D2+dt*(D4+4*D0*D1^2+4*D0^2*D2+4*D0*D3+6*D1*D2)+12*D0*D1)))/24';
orig = [orig '+' orig '+' orig '+' orig '+' orig]; % make it longer
orig = [orig '*' orig '*' orig '*' orig '*' orig];
numel(orig)
ans = 2149
timeit(@() symbolicmethod(orig,D2))
ans = 2.7823
timeit(@() stringmethod(orig))
ans = 2.6660
function symbolicmethod(orig,D2)
origsym = str2sym(orig); % convert to sym
symsubs = subs(origsym,D2,0); % substitute and simplify
end
function stringmethod(orig)
strsubs = strrep(orig,'D2','0'); % string substitution with no simplification
strsubs = str2sym(strsubs); % convert to sym and simplify
end
The two results symsubs and strsubs are identical, but the speed advantage of the string method (case 2) in this example is pretty small. It's small enough that it's occasionally slower than the symbolic method (case 1). Can it be made faster? I don't know. Is it better with particular expressions? I don't know.
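One caveat with the plain strrep() substitution above: strrep matches substrings, so it would also rewrite part of a longer name that merely starts with D2. The sketch below contrasts it with regexprep and the word-boundary anchors \< and \>. The expression and the variable names D20 and D12 are hypothetical, chosen only to show the difference.

```matlab
% Hypothetical expression containing D2 as well as a longer name, D20
expr = '6*D2+4*D20*D1+D12';

% Plain substring replacement also hits the "D2" inside "D20"
viaStrrep = strrep(expr, 'D2', '0');           % '6*0+4*00*D1+D12'

% Word-boundary anchors restrict the match to the whole identifier D2
viaRegexp = regexprep(expr, '\<D2\>', '0');    % '6*0+4*D20*D1+D12'

disp(viaStrrep)
disp(viaRegexp)
```

If the files only ever contain single-digit names such as D0 to D4, the two calls give the same result and strrep is sufficient.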
Walter Roberson on 25 Dec 2021
str='-(dt^(1/2)*(24*D0+dt*(6*D2+dt*(D4+4*D0*D1^2+4*D0^2*D2+4*D0*D3+6*D1*D2)+12*D0*D1)))/24';
Now, if D2=0 then str should be simplified to
str='-(dt^(1/2)*(24*D0+dt*(dt*(D4+4*D0*D1^2+4*D0*D3)+12*D0*D1)))/24';
Matching parts, we see that the simplified string has transformed +6*D1*D2 to nothing, 6*D2+ to nothing, and +4*D0^2*D2 to nothing. So the D2 has to be recognized in the leading term of a summation and in the trailing term of a summation, and addends that include a multiplication by D2 are to vanish, at least in the case where D2 is the final factor in the term (we do not have an example to go by to tell whether the term needs to vanish if D2 appears anywhere else in the term).
D0*(D1 + D2) + D0*D1 = D0*(D1+0)
D0/D2 = D0/0
3*(D2 + 1) + 4 = 3*(0+1)
According to the first of those, D2 is to become 0 but the 0 is to stay in the addition. What is different about this case compared to the earlier cases where the term vanished? Well in this case, D2 is not being multiplied by anything. So we deduce that terms should only vanish in an addend if the component that is becoming 0 is the final factor in the multiplication.
According to the second of those, 0 as a pure denominator does not need to vanish. I did not think to ask about the case of D0 + D2/5, where the 0 would be a pure numerator. I also did not think to ask about D0 + D1*D2/5 or D0 + D2*D1/5.
According to the third of those, D2 is again to become 0 while the 0 stays in the addition. The reasoning is the same as for the first case: D2 is not being multiplied by anything, so the addend does not vanish.
But... in the third of those, we also see that the + 4 is intended to vanish. The reason for that is not obvious at all. I cannot come up with any rule about that.
It would be a lot easier on us if you were to give us a list of replacement rules instead of requiring us to give examples and you tell us what result you want for the example.
If I had not asked about those examples, I would have had no way of knowing that you want the 0+ or +0 to stay when the variable appears by itself in an addend.

Accepted Answer

Walter Roberson on 25 Dec 2021

Sorry, I can't be bothered to preserve spaces. Also, without authorization, I went ahead and suppressed multiplication by 1 (but not division by 1).
str = ["-(dt^(1/2)*(24*D0+dt*(6*D2+dt*(D4+4*D0*D1^2+4*D0^2*D2+4*D0*D3+6*D1*D2)+12*D0*D1)))/24"
"-(dt^(1/2)*(24*D0-dt*(6*D2-dt*(D4-4*D0*D1^2-4*D0^2*D2-4*D0*D3-6*D1*D2)-12*D0*D1)))/24"
"D0*(D1 + D2) + D0*D1"
"D0/D2 - D1*D0 + D1*12 - 12*D1 + 13*D0 - D0*13"
"3*(D2 + 1) + 4"]
str = 5×1 string array
    "-(dt^(1/2)*(24*D0+dt*(6*D2+dt*(D4+4*D0*D1^2+4*D0^2*D2+4*D0*D3+6*D1*D2)+12*D0*D1)))/24"
    "-(dt^(1/2)*(24*D0-dt*(6*D2-dt*(D4-4*D0*D1^2-4*D0^2*D2-4*D0*D3-6*D1*D2)-12*D0*D1)))/24"
    "D0*(D1 + D2) + D0*D1"
    "D0/D2 - D1*D0 + D1*12 - 12*D1 + 13*D0 - D0*13"
    "3*(D2 + 1) + 4"
strs = regexprep(str, '\s', '');
str0 = regexprep(strs, {'\<D2\>', '\<D0\>'}, {'0', '1'})
str0 = 5×1 string array
    "-(dt^(1/2)*(24*1+dt*(6*0+dt*(D4+4*1*D1^2+4*1^2*0+4*1*D3+6*D1*0)+12*1*D1)))/24"
    "-(dt^(1/2)*(24*1-dt*(6*0-dt*(D4-4*1*D1^2-4*1^2*0-4*1*D3-6*D1*0)-12*1*D1)))/24"
    "1*(D1+0)+1*D1"
    "1/0-D1*1+D1*12-12*D1+13*1-1*13"
    "3*(0+1)+4"
strnz = regexprep(str0, {'[+-]\<[a-zA-Z0-9\*\^]+\*0', '\<[a-zA-Z0-9\*\^]+\*0\+', '\<[a-zA-Z0-9\*\^]+\*0\-'}, {'', '', '-'})
strnz = 5×1 string array
    "-(dt^(1/2)*(24*1+dt*(dt*(D4+4*1*D1^2+4*1*D3)+12*1*D1)))/24"
    "-(dt^(1/2)*(24*1-dt*(-dt*(D4-4*1*D1^2-4*1*D3)-12*1*D1)))/24"
    "1*(D1+0)+1*D1"
    "1/0-D1*1+D1*12-12*D1+13*1-1*13"
    "3*(0+1)+4"
strn1 = regexprep(strnz, {'\*1\>', '\<1\*'}, {'', ''})
strn1 = 5×1 string array
    "-(dt^(1/2)*(24+dt*(dt*(D4+4*D1^2+4*D3)+12*D1)))/24"
    "-(dt^(1/2)*(24-dt*(-dt*(D4-4*D1^2-4*D3)-12*D1)))/24"
    "(D1+0)+D1"
    "1/0-D1+D1*12-12*D1+13-13"
    "3*(0+1)+4"
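The steps above also set D0 to 1 for demonstration. For the original question, where only D2 becomes 0, the same patterns could be wrapped in a small helper and checked against the expected string from the question. This is only a sketch: zeroOut, dropPat, and dropRep are hypothetical names, and it still assumes the zeroed variable appears only as the last factor of a product.

```matlab
% Hypothetical helper (not part of the answer above): same regexes, D0 left alone.
% Assumes the zeroed variable is always the LAST factor of its product term.
dropPat = {'[+-]\<[a-zA-Z0-9\*\^]+\*0', ...   % "+a*b*0" or "-a*b*0": drop the whole term
           '\<[a-zA-Z0-9\*\^]+\*0\+', ...     % leading "a*0+": drop the term and its plus
           '\<[a-zA-Z0-9\*\^]+\*0\-'};        % leading "a*0-": keep only the minus sign
dropRep = {'', '', '-'};

% Strip whitespace, substitute 0 for the whole-word variable, then drop zero terms
zeroOut = @(expr, name) regexprep( ...
    regexprep(regexprep(expr, '\s', ''), ['\<' name '\>'], '0'), ...
    dropPat, dropRep);

orig   = '-(dt^(1/2)*(24*D0+dt*(6*D2+dt*(D4+4*D0*D1^2+4*D0^2*D2+4*D0*D3+6*D1*D2)+12*D0*D1)))/24';
wanted = '-(dt^(1/2)*(24*D0+dt*(dt*(D4+4*D0*D1^2+4*D0*D3)+12*D0*D1)))/24';

result = zeroOut(orig, 'D2');
disp(strcmp(result, wanted))   % 1 if the result matches the string from the question
```

Because regexprep applies a cell array of patterns one after another, the order of the three patterns matters: inner and trailing terms are removed first, then leading ones.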

4 Comments

Mohammad Shojaei Arani on 26 Dec 2021
Quite a lot of nice ideas!
Thanks a lot for your precious time!
Walter Roberson on 26 Dec 2021
I just realized that the code does not properly handle ^0
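A related edge case (which may or may not be the one meant here) arises when the zeroed variable itself carries an exponent, e.g. 4*D2^2 becoming 4*0^2. The term-dropping pattern then removes "+4*0" and leaves the "^2" attached to the previous term. The sketch below shows the effect and one possible guard, a negative lookahead that refuses to match when the 0 is followed by another operator, digit, or letter. The guard is an assumption for illustration, not something proposed in the thread, and it leaves such terms unsimplified rather than removing them.

```matlab
% Hypothetical input: what 'D0+4*D2^2+D1' looks like after replacing D2 with 0
s = 'D0+4*0^2+D1';

% Pattern from the answer: removes "+4*0" and leaves "^2" stuck to D0
unsafe = regexprep(s, '[+-]\<[a-zA-Z0-9\*\^]+\*0', '');                    % 'D0^2+D1'

% Guarded variant (assumption): only match when the 0 really ends the term
guarded = regexprep(s, '[+-]\<[a-zA-Z0-9\*\^]+\*0(?![\w\^\*/\.])', '');    % 'D0+4*0^2+D1'

% The guard does not interfere with the ordinary case
plain = regexprep('a+6*D1*0', '[+-]\<[a-zA-Z0-9\*\^]+\*0(?![\w\^\*/\.])', '');  % 'a'

disp(unsafe)
disp(guarded)
disp(plain)
```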
Mohammad Shojaei Arani on 27 Dec 2021
Thanks Walter for your precious time. I will have to find a way for this.
Walter Roberson on 27 Dec 2021
Do you have situations where you have ^ with the power being a variable?
... Because if you do not, then I am not going to bother to write up a solution.
