ai-tech / dify
Commit 3e6cafbe authored Jun 28, 2023 by jyong
Merge branch 'fix/fix-special-code' into deploy/dev
Parents: 442f1dae 8552c6e5
Showing 1 changed file with 5 additions and 2 deletions

api/core/indexing_runner.py (+5 -2)
...
@@ -235,7 +235,8 @@ class IndexingRunner:
             if len(preview_texts) < 5:
                 preview_texts.append(document.page_content)
-            tokens += TokenCalculator.get_num_tokens(self.embedding_model_name, document.page_content)
+            tokens += TokenCalculator.get_num_tokens(self.embedding_model_name, self.filter_string(document.page_content))
     return {
         "total_segments": total_segments,
...
@@ -345,6 +346,8 @@ class IndexingRunner:
        return text_docs

    def filter_string(self, text):
        text = text.replace('<|', '<')
        text = text.replace('|>', '>')
        pattern = re.compile('[\x00-\x08\x0B\x0C\x0E-\x1F\x7F\x80-\xFF]')
        return pattern.sub('', text)
...
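To see what the token count now operates on, here is a minimal standalone sketch of `filter_string`, lifted out of `IndexingRunner` as a plain function. The sample inputs are illustrative and not taken from the commit, and the reading that the `<|` / `|>` replacement is meant to defuse strings resembling tokenizer special tokens (such as `<|endoftext|>`) is an assumption based on the branch name, not something the diff states.

```python
import re

# Same character class as in the diff: C0 control characters except
# tab (\x09), LF (\x0A) and CR (\x0D), plus DEL and U+0080-U+00FF.
_FILTER_PATTERN = re.compile('[\x00-\x08\x0B\x0C\x0E-\x1F\x7F\x80-\xFF]')


def filter_string(text: str) -> str:
    """Standalone copy of the method body shown in the diff."""
    # Break up "<|" and "|>" (assumed purpose: stop sequences such as
    # "<|endoftext|>" from looking like special tokens).
    text = text.replace('<|', '<')
    text = text.replace('|>', '>')
    # Delete every character matched by the pattern.
    return _FILTER_PATTERN.sub('', text)


# Illustrative inputs (not from the commit).
print(filter_string('a<|endoftext|>b'))       # a<endoftext>b
print(repr(filter_string('x\x00y\tz\n')))     # 'xy\tz\n'
print(filter_string('café'))                  # caf
```

The last call shows a side effect worth checking against the intended behavior: on a Python `str`, the range `\x80-\xFF` matches code points U+0080 through U+00FF, so Latin-1 letters such as `é` are removed along with the control characters.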