1727 words
9 minutes
Solution Report of String(Middle) Topic
2026-04-29

P12923 [POI 2021/2022 R3] 模板 2 / Szablon 2#

Useful Link
Problem Statement
Reference Blog

You can found that if a chain of expanding just contribute length less than half, they all can be replaced by the last one.

Then we can using ExKMP to calculate the LCP of origin string and adverse string, knowing that which place should not place string.

The last part we can greedy, the strategy is find the most back position which can compare a origin string or adverse string, it is easy. But there I know a new way to maintain this position which is using DSU instead SegTree.

At first each node point to themselves. Then if note xx become invalid, let fa[x] = find(x-1). So the root of each DSU is the most back position which valid.

P11291 【MX-S6-T3】「KDOI-11」简单的字符串问题 2#

Useful Link
Problem Statement
Reference Blog

P10716 【MX-X1-T4】「KDOI-05」简单的字符串问题#

Useful Link
Problem Statement
Reference Blog

We can prove that a string A is valid if and only if:

  • A is a border of S[1..i];
  • A appears at least k times in S[1..i] without overlap.

After building the failure tree, we can observe that all valid A form a chain from some node u to the root, and u is an ancestor of i. Therefore, once we know u, the answer is simply dep[u].

We can find u using binary lifting.
This is equivalent to checking the following condition:

Whether the position of the k-th non-overlapping occurrence of S[1..u] is ≤ i.

So we preprocess, for every prefix S[1..u], the positions where this string appears in the whole string without overlap. This is equivalent to repeatedly finding the first suffix after the current position whose LCP with S is at least u, and then jumping to that suffix.

The total number of jumps is at most:

i=1nni=O(nlogn)\sum_{i=1}^{n} \frac{n}{i} = O(n \log n)

so we can just simulate the jumps directly.

To compute the LCP between a suffix and S, we can use the Z-function.
Then we apply the DSU trick to maintain a linked list: enumerate u from small to large, and after finishing u, delete all positions p with z[p] = u. This guarantees that when processing u, the DSU representative of a position is exactly the next position with z[p] \ge u.

In this way, we can find in O(\alpha(n)) the first position p after a given index such that z[p] ≥ u.

Note that the case k = 1 needs to be handled separately.

Thus the total time complexity is:

O(qlogn+nα(n)logn)O(q \log n + n \alpha(n) \log n)

P9482 [NOI2023] 字符串#

Useful Link
Problem Statement
Reference Blog

P6125 [JSOI2009] 有趣的游戏#

Useful Link
Problem Statement
Reference Blog

It’s not hard to think of using ACAM to solve this problem. But the difficlut point is if the state denote probability when through a node on ACAM, but ACAM have cycle so we cannot use it.

Consider below things: We stop when across a mark point on ACAM, so we can let state to calculate expectation. Because we stop when meet first mark point, so we can promise the sum of expectation of each node equal to 11. Expectation now is equal to probability.

Left part is easy, just use gaussian elimination solve it.

P5466 [PKUSC2018] 神仙的游戏#

Useful Link
Problem Statement
Reference Blog

We found that if a string ss have a border length lenlen, it will have a cyclic section with length nlenn-len. So wildcard become meanless because we can calculate them by cyclic section.

To be more, if ii and jj are equal modulo nlenn-len, sis_i and sjs_j are must equal, or lenlen is impossible.

So we can define two array:

  1. a[i]a[i], it will equal 11 only when s[i]s[i] is equal 11.
  2. b[i]b[i], it will equal 11 only when s[i]s[i] is equal 00.

Then we can use

C[d]=i=1nA[i]×B[i+d]C[d] = \sum_{i=1}^n A[i]\times B[i+d]

to count there are different number pair in length dd.

left part is easy, just check a length lenlen is valid or not then add it contribution to answer.

P4569 [BJWC2011] 禁忌#

Useful Link
Problem Statement
Reference Blog

The optimal strategy is greedy.

Suppose we have already matched a taboo string ending at position i.

  • If we cut immediately, we gain +1, and restart.
  • If we delay, this segment may overlap with future matches, possibly reducing total count.

Thus: Cut immediately whenever a taboo string appears is optimal.

We build an AC Automaton over all taboo strings.

Each state represents:

  • current suffix match
  • whether we have matched a taboo string (is_taboo)

And do is_taboo[u] |= is_taboo[fail[u]]. So any suffix match is correctly detected.

Let:

  • state denotes current node in AC automaton
  • plus one extra dimension: accumulated expectation

So total dimension is states + 1

Since the string is random, each character is chosen with probability 1 / alphabet.

Enumerate (u, c), let v = next(u, c).

If v is not taboo, just transfer u → v.
If v is taboo, we greedily cut here, so u → root and contribute +1.

To handle expectation, add one extra dimension ans_col.
When hitting a taboo, do trans[u][ans_col] += 1 / alphabet and let trans[ans_col][ans_col] = 1 so the answer accumulates.

Now the whole process becomes a linear transfer:

ft+1=ft×transf_{t+1} = f_t × trans

We need exactly length len, so compute trans^len

Starting from root, the answer is (trans^len)[root][ans_col].
Time complexity is O(S3loglen)O(S^3 log len) where SS is number of AC states.

P4081 [USACO17DEC] Standing Out from the Herd P#

Useful Link
Problem Statement
Reference Blog

There is common trick when SA is solving a problem with lot of string. Let these string end to end, and using a character which not in character set to connect them.

Let the initial answer equal to len×(len+1)2\frac{len\times(len+1)}{2}.

For each suffix xx, the excluded answer is max(lcp(x,i),lcp(x,j),lcp(x,k))\max(\operatorname{lcp}(x,i),\operatorname{lcp}(x,j),\operatorname{lcp}(x,k)), where ii is the suffix with the largest rank satisfying rk[i]<rk[x],id[i]id[x]rk[i]<rk[x], id[i]\neq id[x], jj is the suffix with the smallest rank satisfying rk[j]>rk[x],id[j]id[x]rk[j]>rk[x], id[j]\neq id[x], and kk is the suffix with the largest rank satisfying id[k]=id[x]id[k]=id[x].

Then we can find hight array in SA is equal to excluded part. So just minus them.

P4465 [国家集训队] JZPSTR#

Useful Link
Problem Statement
Reference Blog

It said that std of this problem is Unrolled Linked List + SAM, but most solution in Luogu using bitset to solve it and get a great score.

Because the character set is little just 10, we let std::bitset<N> b[x] denote character xx is at pospos if b[x][pos] = 1. Then we can maintain each oeprate easily.

  1. right move origin string length of yiy_i from xix_i, then put yiy_i in the blank.
  2. same with above, left move origin string length of yixi+1y_i - x_i + 1.

Then how to match a string become the main problem. Repeat ans = (ans << 1) & b[z[j]-'0'] then the number of 11 is answer, draw the process then you can understand, this algorithm called Shift-and.

P4045 [JSOI2009] 密码#

Useful Link
Problem Statement
Reference Blog

Classic problem, do compression DP on ACAM.

Let dp[i][j][msk] denotes when we contract to place ii and we are on the node jj ACAM and the usage state of preparing string is mskmsk. Then transition is easy.

dp[i][j][msk]dp[i+1][trj,l][mskstatetrj,l]dp[i][j][msk] \rightarrow dp[i+1][tr_{j,l}][msk|state_{tr_{j,l}}]

trj,ltr_{j,l} denotes the son of node jj which is meaning letter ll.

The answer is sum of dp[n][j][fullmask]dp[n][j][fullmask].

P3823 [NOI2017] 蚯蚓排队#

Useful Link
Problem Statement
Reference Blog

Becasue kk is very little, we can maintain answer of each kk when hashtable is merging and splitting.

We can maintain this queue by list, and answer only related to kk. For example, if we are merging two queue, just enumerate the length ll and enumerate start point jj then count all string that two endpoints are on both sides of the splice.

The time complexity is O(nk+ck2+s)O(nk + ck^2 + \sum |s|), cc is the number of second operation. Because each operate will just produce k2k^2 hash value at most.

P1117 [NOI2016] 优秀的拆分#

Useful Link
Problem Statement
Reference Blog

P3234 [HNOI2014] 抄卡组#

Useful Link
Problem Statement
Reference Blog

Analyze in 3 cases.

  1. All string have no *, compare them with hash.
  2. All string have *, check substring from 0 to first * is prefix of next string, and the substring from last * to end is suffix of next string. Remember sort with length.
  3. Both two case. First, check same as case 1. Then we can use hash check beginning, middle, ending part of this string.

P3715 [BJOI2017] 魔法咒语#

Useful Link
Problem Statement
Reference Blog

Also a classic problem, similar as P4045 but this problem should split into two part to solve it.

First for small LL, do compress DP on ACAM, the state dp[len][j]dp[len][j] denotes we already get new magic which length is lenlen and we are at node jj on ACAM. Transition is easy. Enumerate all string if there are no bad word, we can transition.

Then try think how to solve this problem when L<=108L <= 10^8, first we can know that we must use matrix multiple to solve it when LL become such so big. So we have below idea:

[FtFt+1]=Matrix×[Ft1Ft]\left[\begin{matrix} F_t \\ F_{t+1} \end{matrix}\right] = \operatorname{Matrix} \times \left[\begin{matrix} F_{t-1} \\ F_{t} \end{matrix}\right]

Then their is a common trick when construct matrix: if ii can transition to jj, mat[i][j]++.
Same as first case, transition is easy, then we can get transition matrix.

P3245 [HNOI2016] 大数#

Useful Link
Problem Statement
Reference Blog

We using sum[i]sum[i] denotes the value of S[in]S[i\dots n].

Also we can analyze into 22 cases: PP equal or not equal to 22 and 55.

If is not:

 sum[L]sum[R+1]10nRmodP=0 (sum[L]sum[R+1])modP=0 sum[L]sum[R+1](modP)\begin{aligned} &\ \frac{sum[L] - sum[R+1]}{10^{n-R}} \bmod P = 0\\ \Rightarrow &\ (sum[L] - sum[R+1]) \bmod P = 0 \\ \Rightarrow &\ sum[L] \equiv sum[R+1] \pmod{P} \\ \end{aligned}

Then problem turn to count how many same number in sum[LR+1]sum[L\dots R+1], it a classic problem of Mo’s algorithm.

If is:

It become easier: if the end of a number can be divided by 22 or 55 then whole number can be divided by it, so answer increase ii if a[i]modP=0a[i] \bmod P = 0.

P3546 [POI 2012] PRE-Prefixuffix#

Useful Link
Problem Statement
Reference Blog

Because of the length-LL prefix and suffix can be represented as XY and YX, if we using kk denote the cut position between XX and YY. Then we have : s[1k]=s[nk+1n],s[k+1L]=s[nL+1nk]s[1\cdots k]=s[n-k+1\cdots n],s[k+1\cdots L]=s[n-L+1\cdots n-k]. This means that problem wants to us calculate two borders.

then is a interesting transition: turn origin string into form s1,sn,s2,sn1,s_1,s_n,s_2,s_{n-1},\cdots, two borders problem let we calculate become two palindromes on top. So using manacher can solve this problem easily.

P3454 [POI 2007] OSI-Axes of Symmetry#

Useful Link
Problem Statement
Reference Blog

If a shape is symmetrical, we can get a palindrome if cut from a endpoint of axis of symmetry.

So use cross product instead of angle, the square of an edge’s length instead of length of edge. Then manacher to count how many palindrome can cover a length nn substring in the double-string. double-string means we paste a copy of origin string at the end of origin string.

Solution Report of String(Middle) Topic
https://blog.517group.cn/posts/202604042043/
Author
XianRuiDendro
Published at
2026-04-29
License
CC BY-NC-SA 4.0