We store the daily numbers of newly infected people with some virus (new infections, below) in a text fle in chronological order. The numbers are separated by colons (:). For example.when the numbers of new infections for 5 days are
(1) Find the 10th biggest number of new infections stored in the text file infections.txt and write it on the answer sheet. Count after removing duplicate numbers. For example, thethird biggest number among 1, 2, 3, 3, 4 is 2.
(2) For every text file f in the data folder, find \(N_f\),, the 10th biggest number of new infections in f, as in Question (1). Then calculate the sum of all \(N_f\), and write it on the answer sheet.
(3) For \(x_0,x_1,x_2,x_3,\dots x_{n-1}\), the numbers of new infections stored in the text fle infections.txt,the new-infection increment on a day is the diference \(x_i - x_{i-1}\) between the number of the new infections \(x_i\) on that day and \(x_{i-1}\) on the day before. Here, the number of new infections on the day before the frst day is zero.
Concatenate the new-infection increment for every day into one character sequence and store it in the text file diff.txt, Furthermore, count the characters in this sequence and write that number on the answer sheet, The newline character is not counted. Start with + for a non-negative number and start with - for a negative number.For example,when the new-infection increments are
621 -30 0 -316 214 -89
the following character sequence
+621-30+0-316+214-89
is stored and the number of characters is 20
(4) For the numbers of new infections in the text fle infections.txt, find the shortest period among the periods in which the sum of the new-infection increments is maximized. Write that period on the answer sheet. If more than one such period is found, write all the periods. The frst day is Day 1. For example, answer like “From Day 8 to 24”.Furthermore.calculate the sum of the new-infection increments during that period, and write that sum onthe answer sheet.
(1) Let \(x_0, x_1, x_2, x_3, ..., x_{n-1}\) be the numbers of new infections on every day stored in the text file infections.txt. We define the following function
Calculate the maximum and minimum values of \(ave(i)\) and write them on the answer sheet. Furthermore, calculate the sum
\[
\sum_{i=3}^{n-4} ave(i)
\]
and write it on the answer sheet. Round those values to 4 decimal places.
(2) Let \(x_0, x_1, x_2, x_3, ..., x_{m-1}\) and \(y_0, y_1, y_2, y_3, ..., y_{n-1}\) be the numbers of new infections stored in text files \(x\) and \(y\), respectively (\(m \geq n\)). We define \(s(x, y)\), the similarity score between these two files, as
Among arbitrary pairs of two files in the folder data, find the pair with the highest similarity score and write the two file names on the answer sheet. Furthermore, write the similarity score on the answer sheet. When more than one such pair is found, write all the pairs and their scores.
(3) Let \(x_0, x_1, x_2, x_3, ..., x_{n-1}\) be the numbers of new infections stored in the text file infections2.txt. These numbers are denoted by \(\{x_i\}\).
We find the approximate formula \(ai + k\) that has a good fit to these numbers \(\{x_i\}\). For example, the approximate value for \(x_3\) is \(3a + k\). Here, \(a\) and \(k\) are the constants that minimize the error
\[
\sum_{i=0}^{n-1} (ai + k - x_i)^2
\]
for \(\{x_i\}\). They are calculated as follows.
\[
a = \frac{n \sum i x_i - \sum i \sum x_i}{n \sum i^2 - (\sum i)^2}
\]
\[
k = \frac{\sum i^2 \sum x_i - \sum i x_i \sum i}{n \sum i^2 - (\sum i)^2}
\]
where \(\sum\) represents \(\sum_{i=0}^{n-1}\).
Calculate \(a\) and \(k\) rounded to 4 decimal places and write them on the answer sheet.
(4) Let \(x_0, x_1, x_2, x_3, ..., x_{n-1}\) be the numbers of new infections stored in the text file infections2.txt. For a given \(s\), a sub-sequence \(x_s, x_{s+1}, x_{s+2}, ..., x_{s+30}\) of these numbers is denoted by \(\{x_{s+i}\}\). Here, \(0 \leq s < n - 30\).
We find the approximate formula \(ka^i\) that has a good fit to a sub-sequence \(\{x_{s+i}\}\). For example, the approximate value of \(x_{s+3}\) is \(ka^3\). Here, \(a\) and \(k\) are the constants that minimize this error
for \(\{x_{s+i}\}\).
Find \(s\) such that it maximizes the value of \(a\) in the approximate formula \(ka^i\) for \(\{x_{s+i}\}\). Write the values of \(s, a, k\) for such \(s\) on the answer sheet. Round \(a\) and \(k\) to 4 decimal places. When more than one such \(s\) is found, write the values of \(s, a, k\) for every \(s\).
We currently do not have the corresponding sample data files. If you have them and are willing to share, please submit a PR.
The author create some simple samples (thus the file-related parts in the following code do not fully correspond to the problem statement but rather to the data file created by the author) and the completed code files as well as the simple samples are in this repository.
/*前言:这次的卷子也不是很难,除了最后一题有点考察数学那边最小二乘法有点思维的转换,其他的基本都是按照题意模拟即可,不过这里考察了一个遍历文件夹,也是需要准备一下相关的库函数的使用。*/#include<bits/stdc++.h>#define int long longusingnamespacestd;voidsolve(){ifstreamfin("E:/UTokyo_Entrance_Exam/CI/2022_summer/infections.txt",ios::in);ofstreamfout("E:/UTokyo_Entrance_Exam/CI/2022_summer/ans11.txt",ios::out);if(!fin.is_open())assert(0);stringstr;fin>>str;vector<int>vec;//先处理文件输入读到vec里面intnum=0;for(autoch:str){//直接暴力逐字符读取, 可能有直接用库以冒号切割的方法if(ch==':'){vec.push_back(num);num=0;}elsenum=num*10+ch-'0';}vec.push_back(num);// 找第10大的去重后的数字sort(vec.begin(),vec.end());vec.erase(unique(vec.begin(),vec.end()),vec.end());assert(vec.size()>=10);fout<<vec[vec.size()-10]<<"\n";}signedmain(){intt=1;// cin >> t;while(t--)solve();return0;}
/*按照题意简单模拟,引入了差分的概念*/#include<bits/stdc++.h>#define int long longusingnamespacestd;intcal(intnum){intres=0;while(num)num/=10,res++;returnres;}voidsolve(){ifstreamfin("E:/UTokyo_Entrance_Exam/CI/2022_summer/infections1.txt",ios::in);ofstreamfout("E:/UTokyo_Entrance_Exam/CI/2022_summer/ans13.txt",ios::out);if(!fin.is_open())assert(0);stringstr;fin>>str;vector<int>vec;// 先处理文件输入读到vec里面intnum=0;for(autoch:str){// 直接暴力逐字符读取, 可能有直接用库以冒号切割的方法if(ch==':'){vec.push_back(num);num=0;}elsenum=num*10+ch-'0';}vec.push_back(num);vector<int>diff;diff.push_back(vec[0]);for(inti=1;i<vec.size();i++){diff.push_back(vec[i]-vec[i-1]);}intcnt=0;for(autoit:diff){if(it>=0)fout<<"+"<<it;elsefout<<it;cnt+=1+cal(abs(it));}fout<<"\n"<<"the number of characters is "<<cnt<<"\n";}signedmain(){intt=1;// cin >> t;while(t--)solve();return0;}
/*这题考察的是经典dp中的最大子段和,不过还需要记录一下最大子段和的位置*/#include<bits/stdc++.h>#define int long long#define pii pair<int, int>usingnamespacestd;intcal(intnum){intres=0;while(num)num/=10,res++;returnres;}voidsolve(){ifstreamfin("E:/UTokyo_Entrance_Exam/CI/2022_summer/infections2.txt",ios::in);ofstreamfout("E:/UTokyo_Entrance_Exam/CI/2022_summer/ans14.txt",ios::out);if(!fin.is_open())assert(0);stringstr;fin>>str;vector<int>vec;// 先处理文件输入读到vec里面intnum=0;for(autoch:str){// 直接暴力逐字符读取, 可能有直接用库以冒号切割的方法if(ch==':'){vec.push_back(num);num=0;}elsenum=num*10+ch-'0';}vec.push_back(num);vector<int>diff;diff.push_back(vec[0]);for(inti=1;i<vec.size();i++){diff.push_back(vec[i]-vec[i-1]);}intmxsum=diff[0],nowsum=diff[0],mnlen=1,nowlen=1;vector<pii>res;res.push_back({1,1});for(inti=1;i<diff.size();i++){nowsum+=diff[i],nowlen++;if(nowsum>mxsum||nowsum==mxsum&&nowlen<mnlen){//更大或者相等且更短mxsum=nowsum,mnlen=nowlen;res.clear();res.push_back({i+1-nowlen+1,i+1});}elseif(nowsum==mxsum&&nowlen==mnlen){res.push_back({i+1-nowlen+1,i+1});}elseif(nowsum<=0){nowsum=nowlen=0;}}for(auto[l,r]:res)fout<<"From Day"<<l<<" to "<<r<<"\n";fout<<"the sum of that period is "<<mxsum<<"\n";}signedmain(){intt=1;// cin >> t;while(t--)solve();return0;}
/*按照题意模拟,c++要注意精度问题,由于没有数据文件,不知道题目的精度要求怎么样,保险起见用long long和long double尽量保持精度,除法也是最后再除*/#include<bits/stdc++.h>#define int long long#define db long doubleusingnamespacestd;voidsolve(){ifstreamfin("E:/UTokyo_Entrance_Exam/CI/2022_summer/infections.txt",ios::in);ofstreamfout("E:/UTokyo_Entrance_Exam/CI/2022_summer/ans21.txt",ios::out);if(!fin.is_open())assert(0);stringstr;fin>>str;vector<int>vec;// 先处理文件输入读到vec里面intnum=0;for(autoch:str){// 直接暴力逐字符读取, 可能有直接用库以冒号切割的方法if(ch==':'){vec.push_back(num);num=0;}elsenum=num*10+ch-'0';}vec.push_back(num);dbmx=-1e9,mn=1e9;intsum=0,n=vec.size();for(inti=3;i<n-3;i++){inttem=0;for(intk=-3;k<=3;k++)tem+=vec[i+k];dbave=(db)tem/7.0;mx=max(mx,ave),mn=min(mn,ave);sum+=tem;//c++有浮点误差,尽量用整数,最后再统一除7.0}fout<<fixed<<setprecision(4)<<"max ave = "<<mx<<"\n"<<"min ave = "<<mn<<"\n"<<"sum of ave = "<<(db)((db)sum/7.0)<<"\n";}signedmain(){intt=1;// cin >> t;while(t--)solve();return0;}
/*题目给出最小二乘法公式,按照题目计算一下,同样注意精度问题*/#include<bits/stdc++.h>#define int long long#define db long doubleusingnamespacestd;voidsolve(){ifstreamfin("E:/UTokyo_Entrance_Exam/CI/2022_summer/infections2.txt",ios::in);ofstreamfout("E:/UTokyo_Entrance_Exam/CI/2022_summer/ans23.txt",ios::out);if(!fin.is_open())assert(0);stringstr;fin>>str;vector<int>vec;// 先处理文件输入读到vec里面intnum=0;for(autoch:str){// 直接暴力逐字符读取, 可能有直接用库以冒号切割的方法if(ch==':'){vec.push_back(num);num=0;}elsenum=num*10+ch-'0';}vec.push_back(num);intsum_ixi=0,sum_i=0,sum_i2=0,sum_xi=0;intn=vec.size();for(inti=0;i<n-1;i++){sum_ixi+=i*vec[i];sum_i+=i;sum_xi+=vec[i];sum_i2+=i*i;}dba=((db)n*sum_ixi-sum_i*sum_xi)/(n*sum_i2-(sum_i)*(sum_i));dbk=((db)sum_i2*sum_xi-sum_ixi*sum_i)/(n*sum_i2-sum_i*sum_i);fout<<fixed<<setprecision(4)<<"a = "<<a<<", k = "<<k<<"\n";}signedmain(){intt=1;// cin >> t;while(t--)solve();return0;}
/*这题有点思维含量,要注意到合理利用第三问的公式,取对数转换为第三问的线性问题。要注意c++没有loge函数,还需要用换底公式 loge x = log2(x) / log2(e), c++中的exp(x)函数是 e ^ x, exp(1) 即为自然对数e*/#include<bits/stdc++.h>#define int long long#define db long double#define pii pair<int, int>usingnamespacestd;vector<int>readfile(stringpath){ifstreamfin(path,ios::in);if(!fin.is_open())assert(0);stringstr;fin>>str;vector<int>vec;// 先处理文件输入读到vec里面intnum=0;for(autoch:str){// 直接暴力逐字符读取, 可能有直接用库以冒号切割的方法if(ch==':'){vec.push_back(num);num=0;}elsenum=num*10+ch-'0';}vec.push_back(num);returnvec;}piicalak(vector<db>vec){// 给定 x 求线性的最小拟合的 a 和 kdbsum_ixi=0,sum_i=0,sum_i2=0,sum_xi=0;intn=vec.size();for(inti=0;i<n-1;i++){sum_ixi+=i*vec[i];sum_i+=i;sum_xi+=vec[i];sum_i2+=i*i;}dba=((db)n*sum_ixi-sum_i*sum_xi)/(n*sum_i2-(sum_i)*(sum_i));dbk=((db)sum_i2*sum_xi-sum_ixi*sum_i)/(n*sum_i2-sum_i*sum_i);return{exp(a)*10000,exp(k)*10000};}voidsolve(){ofstreamfout("E:/UTokyo_Entrance_Exam/CI/2022_summer/ans24.txt",ios::out);vector<int>vec=readfile("E:/UTokyo_Entrance_Exam/CI/2022_summer/infections3.txt");vector<array<int,3>>res;intmxa=-1e18;intn=vec.size();for(ints=0;s<n-30;s++){vector<db>x;for(inti=s;i<=s+30;i++)x.push_back(log(vec[i]+1)/log(exp(1)));auto[a,k]=calak(x);if(a>mxa)mxa=a,res.clear(),res.push_back({s,a,k});elseif(a==mxa)res.push_back({s,a,k});}for(auto[s,a,k]:res){fout<<fixed<<setprecision(4)<<"s = "<<s<<", a = "<<a/1e4<<", k = "<<k/1e4<<"\n";}}signedmain(){intt=1;// cin >> t;while(t--)solve();return0;}