Text Fragment Rendering

Text fragment rendering displays a specific portion of a block's text (defined by a character range) with every inline embed in that text replaced by the name of the document it references.

Overview

When embedding a document fragment with a range (e.g., hm://account/doc#blockId[10:50]), the system needs to:

Extract code points 10 through 49 from the block's text (the end of the range is exclusive, as in slice(start, end))

Resolve any inline embeds within that range

Display the resulting text with embed names
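As a rough illustration of the first step, the range can be read off the end of the link. This is only a sketch: parseFragmentRef and FragmentRef are hypothetical names, not the application's actual ID-unpacking code, and the pattern assumes the #blockId[start:end] form shown above.

```typescript
// Hypothetical shape for illustration; the real unpacked ID type may differ.
interface FragmentRef {
  blockRef: string
  start: number
  end: number
}

// Hypothetical helper: extracts "#<blockRef>[<start>:<end>]" from the end of a URL.
function parseFragmentRef(url: string): FragmentRef | null {
  const match = /#([^\[\]#]+)\[(\d+):(\d+)\]$/.exec(url)
  if (!match) return null
  const blockRef = match[1]
  if (blockRef === undefined) return null
  const start = Number(match[2])
  const end = Number(match[3])
  // An inverted range cannot select any text.
  if (end < start) return null
  return {blockRef, start, end}
}

console.log(parseFragmentRef('hm://account/doc#blockId[10:50]'))
```

A link without a bracketed range (for example hm://account/doc#blockId) yields null here, which a caller could treat as a normal block embed.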

The Challenge

Fragment ranges are specified as Unicode code point positions in the original text. These positions:

Count actual text characters

Exclude invisible inline embed markers (U+FEFF)

Use Unicode code points (not UTF-16 code units)

But we want to display text where:

Inline embeds are replaced with document names

Character positions map correctly to the original range

Solution: FragmentText Component

Component Location

frontend/packages/ui/src/document-content.tsx

How It Works

function FragmentText({
  documentId,
  blockRef,
  start,
  end,
}: {
  documentId: UnpackedHypermediaId
  blockRef: string
  start: number
  end: number
})

Process:

Fetch Full Text: Call getDocumentText with the blockRef and resolveInlineEmbeds: true

getDocumentText(
  {...documentId, blockRef, blockRange: null},
  {lineBreaks: false, resolveInlineEmbeds: true}
)

Extract Fragment: Use Array.from() to split the text into Unicode code points before slicing

const codePoints = Array.from(fullText)
const fragment = codePoints.slice(start, end).join('')

Display: Render the extracted text

<Text className="whitespace-pre-wrap">{fragment}</Text>
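The extraction step can be isolated into a small pure function, which makes it easy to test without rendering anything. The sketch below is an assumption about how one might package it; extractFragment is a hypothetical name, and the clamping of out-of-range values is an added safety measure that the component itself may not perform.

```typescript
// Hypothetical helper: slices a string by Unicode code points, not UTF-16 code units.
function extractFragment(fullText: string, start: number, end: number): string {
  const codePoints = Array.from(fullText)
  // Clamp both ends so negative or oversized values cannot wrap around
  // (Array.prototype.slice treats negative indices as offsets from the end).
  const from = Math.max(0, Math.min(start, codePoints.length))
  const to = Math.max(from, Math.min(end, codePoints.length))
  return codePoints.slice(from, to).join('')
}

const blockText = 'Hello world, this is a test paragraph with some content.'
console.log(extractFragment(blockText, 0, 11)) // "Hello world"
```

Keeping this logic outside the component also means the same function could be reused on the server if fragments are ever extracted there.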

Integration with ContentEmbed

The ContentEmbed component detects text fragments and renders them appropriately:

// Check if this is a text fragment (blockRef with start/end range)
const isTextFragment =
  props.blockRef &&
  props.blockRange &&
  'start' in props.blockRange &&
  'end' in props.blockRange

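// The individual checks are repeated here so that TypeScript narrows
// props.blockRef and props.blockRange inside the branch.
if (isTextFragment && props.blockRef && props.blockRange &&
    'start' in props.blockRange && 'end' in props.blockRange) {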
  // Render as plain text with resolved embeds
  content = (
    <FragmentText
      documentId={narrowHmId(props)}
      blockRef={props.blockRef}
      start={props.blockRange.start}
      end={props.blockRange.end}
    />
  )
} else {
  // Normal block rendering
  // ...
}
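One way to avoid repeating the range checks is a user-defined type guard. The sketch below is illustrative only: the ExpandedRange, TextRange, and BlockRange shapes are assumptions made for the example (the real definitions live in the shared type files), and isTextRange and describeEmbed are hypothetical names.

```typescript
// Assumed shapes for illustration; the project's real types may differ.
type ExpandedRange = {expanded: boolean}
type TextRange = {start: number; end: number}
type BlockRange = ExpandedRange | TextRange

// Type guard: tells the compiler that the range carries start/end.
function isTextRange(range: BlockRange | null | undefined): range is TextRange {
  return !!range && 'start' in range && 'end' in range
}

// Hypothetical stand-in for the rendering decision made in ContentEmbed.
function describeEmbed(blockRef: string | null, blockRange: BlockRange | null): string {
  if (blockRef && isTextRange(blockRange)) {
    // blockRange is narrowed to TextRange here, so start and end are typed numbers.
    return `fragment ${blockRef}[${blockRange.start}:${blockRange.end}]`
  }
  return 'block'
}

console.log(describeEmbed('blockId', {start: 10, end: 50})) // "fragment blockId[10:50]"
```

With a guard like this, a single condition both selects the fragment branch and narrows the type.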

Example Scenarios

Example 1: Simple Text Fragment

Original block text:

"Hello world, this is a test paragraph with some content."

Fragment: #blockId[0:11]

Result: "Hello world"

Example 2: Fragment with Inline Embed

Original block text (with invisible markers):

"Check out \uFEFF post about AI!"
// Position: 0-9, [10 = embed], 11-25

With inline embed resolved:

"Check out @Alice's Guide post about AI!"

Fragment: #blockId[0:20]

Result: "Check out @Alice's G" (the first 20 Unicode code points of the resolved text)

Example 3: Multiple Inline Embeds

Original text:

"Read \uFEFF and \uFEFF for more info"
// [Read ][embed1][ and ][embed2][ for more info]

With embeds resolved:

"Read @Getting Started and @Advanced Topics for more info"

Fragment: #blockId[0:25]

Result: "Read @Getting Started and" (the first 25 code points of the resolved text; the second embed name falls outside the range)

Character Position Mapping

Key Concepts

Original Positions: Defined in the blockRange, count actual text excluding embed markers

Resolved Text: After documentToText processes it, embeds become their document names

Unicode Code Points: Use Array.from() to count characters that occupy more than one UTF-16 code unit (such as emoji) as a single position

Why Array.from()?

JavaScript strings are sequences of UTF-16 code units. Emoji and other characters outside the Basic Multilingual Plane take two code units each:

// Wrong: UTF-16 code units
"Hello 👋".length // 8 (emoji uses 2 code units)

// Correct: Unicode code points
Array.from("Hello 👋").length // 7 (emoji is 1 code point)
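The difference matters most when slicing, because a code-unit index can land in the middle of a surrogate pair. The following small comparison uses an illustrative string chosen for this example:

```typescript
const text = 'Hi 👋 there'

// Code-unit slicing: index 4 falls between the two halves of the emoji,
// so the result ends with a lone surrogate that renders as a broken character.
const byCodeUnits = text.slice(0, 4)

// Code-point slicing: the emoji is a single array element, so it is kept whole.
const byCodePoints = Array.from(text).slice(0, 4).join('')

console.log(byCodeUnits.length) // 4
console.log(byCodePoints.length) // 5
console.log(byCodePoints === 'Hi 👋') // true
```

Code points are still not the same as user-perceived characters: an emoji composed of several code points joined by U+200D counts as more than one position, so ranges that cut through such a sequence can still split it.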

API Endpoint Support

Fragment rendering is supported on both desktop and web:

Desktop

Direct grpcClient access

Synchronous document fetching

Immediate text resolution

Web

Server-side API: /hm/api/document-text

Accepts blockRef in URL parameters

Returns resolved text via JSON
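For the web path, a client needs to turn the document ID and blockRef into a request URL. The sketch below shows one possible shape. Only the /hm/api/document-text path comes from this document; the query parameter names (uid, path, version, blockRef), the DocumentTextQuery type, and buildDocumentTextUrl are assumptions for illustration, so check the route handler for the actual contract.

```typescript
// Hypothetical query shape; parameter names are assumed, not taken from the route.
interface DocumentTextQuery {
  uid: string
  path?: string[]
  version?: string
  blockRef: string
}

// Builds a relative URL for the document-text endpoint.
function buildDocumentTextUrl(query: DocumentTextQuery): string {
  const params: Array<[string, string]> = [
    ['uid', query.uid],
    ['blockRef', query.blockRef],
  ]
  if (query.path && query.path.length > 0) {
    params.push(['path', query.path.join('/')])
  }
  if (query.version) {
    params.push(['version', query.version])
  }
  const search = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&')
  return `/hm/api/document-text?${search}`
}

console.log(buildDocumentTextUrl({uid: 'abc', blockRef: 'b1', path: ['notes', 'intro']}))
```

The start and end values are deliberately left out of the request in this sketch, since the component described above fetches the full block text and slices it on the client.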

Component States

Loading

if (loading) {
  return (
    <div className="flex items-center justify-center p-2">
      <Spinner />
    </div>
  )
}

Error

if (error) {
  return <ErrorBlock message={`Failed to load fragment: ${error}`} />
}

Success

return (
  <Text className="whitespace-pre-wrap">
    {text}
  </Text>
)
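The three states can also be modelled as a single discriminated union, so that text is only accessible once loading has succeeded. This is a sketch of that idea rather than the component's actual state handling; FragmentState and describeState are hypothetical names, and the function returns strings in place of JSX to stay self-contained.

```typescript
// Hypothetical state model: exactly one of the three states is active at a time.
type FragmentState =
  | {status: 'loading'}
  | {status: 'error'; error: string}
  | {status: 'success'; text: string}

// Returns a plain-text stand-in for what each state would render.
function describeState(state: FragmentState): string {
  switch (state.status) {
    case 'loading':
      return 'spinner'
    case 'error':
      return `Failed to load fragment: ${state.error}`
    case 'success':
      return state.text
  }
}

console.log(describeState({status: 'error', error: 'block not found'}))
```

Because the switch is exhaustive, the compiler flags any new state that is added without a matching branch.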

Usage in Embeds

When a user creates an embed with a text range:

Editor: User selects text range in a block

Link Creation: System creates link like hm://account/doc#blockId[10:50]

Rendering:

ContentEmbed detects the range

FragmentText fetches and extracts text

Displays resolved fragment

Performance Considerations

Caching

Consider caching getDocumentText results

Fragment extraction is fast (O(n) where n = text length)

Network requests may be slow on web
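A simple in-memory cache keyed by document and block is one way to act on the caching suggestion above. The sketch below is an assumption, not existing project code: fragmentCacheKey, createTextCache, and TextFetcher are hypothetical names, and there is no eviction or invalidation, which a real cache would likely need.

```typescript
// Hypothetical fetcher type standing in for a getDocumentText call.
type TextFetcher = (key: string) => Promise<string>

// Builds a stable key from the same ID components that identify a block's text.
function fragmentCacheKey(
  uid: string,
  path: string[] | null,
  version: string | null,
  blockRef: string,
): string {
  return [uid, path?.join('/') ?? '', version ?? '', blockRef].join('|')
}

// Wraps a fetcher so that concurrent and repeated requests share one promise.
function createTextCache(fetchText: TextFetcher) {
  const cache = new Map<string, Promise<string>>()
  return function getCached(key: string): Promise<string> {
    let pending = cache.get(key)
    if (!pending) {
      pending = fetchText(key).catch((err) => {
        // Drop failed requests so a later call can retry.
        cache.delete(key)
        throw err
      })
      cache.set(key, pending)
    }
    return pending
  }
}
```

Since start and end are not part of the key, several fragments of the same block would share a single fetch and differ only in the client-side slice.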

Optimization

Only fetch when the fragment changes

useEffect dependencies include all ID components

Loading state prevents UI jank

Dependencies

useEffect(() => {
  // ...
}, [
  getDocumentText,
  documentId.uid,
  documentId.path?.join('/'),
  documentId.version,
  blockRef,
  start,
  end
])

Testing

Test scenarios to verify:

Basic fragment extraction: Simple text without embeds

Single inline embed: Fragment includes embed name

Multiple inline embeds: All embeds resolved correctly

Unicode handling: Emojis and special characters

Boundary cases: start=0, end equal to the code point count (Array.from(text).length, which can be smaller than text.length)

Error handling: Missing blocks, network errors

Related Files

frontend/packages/ui/src/document-content.tsx - FragmentText component

frontend/packages/shared/src/document-to-text.ts - Text resolution

frontend/packages/shared/src/document-content-types.ts - Type definitions

frontend/apps/web/app/routes/hm.api.document-text.tsx - Web API