This repository has been archived on 2026-05-14. You can view files and clone it. You cannot open issues or pull requests or push a commit.
MinecraftConsoles/Minecraft.Client/Windows64/Iggy/gdraw/gdraw_d3d1x_shared.inl
MrTheShy 88798b501d Split screen, widescreen support, font rendering fixes, ui scaling fixes (#767)
* Sync keyboard text buffer from Flash before processing physical input

The native keyboard scene maintained a separate C++ buffer
(m_win64TextBuffer) for physical keyboard input, which was pushed
to the Flash text field via setLabel(). However, when the user typed
with the on-screen controller buttons, Flash updated its text field
directly through ActionScript without updating the C++ buffer.

This caused a desync: switching back to the physical keyboard would
overwrite any text entered via controller, since m_win64TextBuffer
still held the old value before the controller edits.

Fix: read the current Flash text field into m_win64TextBuffer at the
start of each tick(), before consuming new physical keyboard chars.
This ensures both input methods always operate on the same state.

* Use last active input device to decide keyboard mode instead of connection state

The keyboard UI mode (on-screen virtual keyboard vs direct text input)
was determined by Win64_IsControllerConnected(), which checks if any
XInput controller is physically plugged in. This meant that even if
the player was actively using mouse and keyboard, the virtual keyboard
would still appear as long as a controller was connected.

Replace the connection check with g_KBMInput.IsKBMActive(), which
tracks the actual last-used input device based on per-frame input
detection. Now the keyboard mode is determined by what the player is
currently using, not what hardware happens to be plugged in.

Affected scenes: CreateWorldMenu (world naming) and LoadOrJoinMenu
(world renaming).

* Fix TextInput caret behavior and add proper cursor editing for KBM direct edit

The direct text editing mode introduced for KBM users had several
issues with the TextInput control's caret (blinking cursor) and text
manipulation:

1. Caret visible when not editing:
   When navigating to the world name field with keyboard/mouse, Flash's
   Iggy focus system would show the blinking caret even though the field
   wasn't active for editing yet (Enter not pressed). This was misleading
   since typing had no effect in that state.

   Fix: access the FJ_TextInput's internal m_mcCaret MovieClip and
   force its visibility based on editing state. This is enforced every
   tick because setLabel() and Flash focus transitions continuously
   reset the caret state.

2. No cursor movement during editing:
   The direct edit implementation treated the text as a simple buffer
   with push_back/pop_back — there was no concept of cursor position.
   Backspace only deleted from the end, and arrow keys did nothing.

   Fix: track cursor position (m_iCursorPos) in C++ and use wstring
   insert/erase at that position. Arrow keys (Left/Right), Home, End,
   and Delete now work as expected. The visual caret position is synced
   to Flash via the FJ_TextInput's SetCaretIndex method.

3. setLabel() resetting caret position:
   Every call to setLabel() (when text changes) caused Flash to reset
   the caret to the end of the string, making the cursor jump visually
   even though the C++ position was correct.

   Fix: enforce caret position via setCaretIndex every tick during
   editing, so any Flash-side resets are immediately corrected.

New UIControl_TextInput API:
- setCaretVisible(bool): toggles m_mcCaret.visible in Flash
- setCaretIndex(int): calls FJ_TextInput.SetCaretIndex in Flash
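
The cursor-aware editing described in point 2 can be sketched roughly as below. This is a minimal illustration, not the actual UIControl_TextInput code: the `EditBuffer` type and its method names are hypothetical, and only the use of wstring insert/erase at a tracked cursor position comes from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the direct-edit state (not the real control class).
struct EditBuffer {
    std::wstring text;
    std::size_t cursor = 0;   // plays the role of m_iCursorPos

    void insertChar(wchar_t ch) {
        assert(cursor <= text.size());
        text.insert(cursor, 1, ch);   // insert at the cursor, not at the end
        ++cursor;
    }
    void backspace() {                // delete the character before the cursor
        if (cursor > 0) { text.erase(cursor - 1, 1); --cursor; }
    }
    void del() {                      // delete the character at the cursor
        if (cursor < text.size()) text.erase(cursor, 1);
    }
    void left()  { if (cursor > 0) --cursor; }
    void right() { if (cursor < text.size()) ++cursor; }
    void home()  { cursor = 0; }
    void end()   { cursor = text.size(); }
};
```

After each change the real control would push `text` to Flash via setLabel() and then re-apply `cursor` through setCaretIndex, since setLabel() moves the caret back to the end.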

* Fix keyboard/arrow navigation not working when no UI element is focused

On Windows64 with KBM, moving the mouse over empty space (outside any
button) would clear the Iggy focus entirely. After that, pressing arrow
keys did nothing because Flash had no starting element to navigate from.

Two changes here:

- Don't set focus to IGGY_FOCUS_NULL when the mouse hovers over empty
  space. The previous hover target stays focused, so switching back to
  arrows keeps working seamlessly.

- When a navigation key is pressed and nothing is focused at all (e.g.
  mouse was already on empty space when the menu opened), grab the first
  focusable element instead of silently dropping the input. The keypress
  is consumed to avoid jumping two elements at once.

This makes mixed mouse+keyboard navigation feel a lot more natural.
You can point at a button, then continue with arrows, or just start
pressing arrows right away without having to hover first.

* Overhaul mouse support and generalize direct text editing to all UI scenes

This is a large rework of the Windows64 KBM (keyboard+mouse) input layer.
It touches the mouse hover system, the mouse click dispatch, and the direct
text editing infrastructure, then applies all of it to every scene that has
text input fields or non-standard clickable elements.

MOUSE HOVER REWRITE (UIController.cpp tickInput)

The old hover code had two structural problems:

(a) Scene lookup was group-first: it iterated UI groups and checked all
layers within each group. The Tooltips layer on eUIGroup_Fullscreen (which
holds non-interactive overlays like button hints) would be found before
in-game menus on eUIGroup_Player1. The tooltip scene's focusable objects
captured mouse input and prevented hover from reaching the actual menu.

Fixed by switching to layer-first lookup across all groups, and skipping
eUILayer_Tooltips entirely since those are never interactive.

(b) On tabbed menus (LaunchMoreOptionsMenu Game vs World tabs), all
controls from all tabs are registered in Flash at the same time. There was
no filtering, so controls from inactive tabs had phantom hitboxes that
overlapped the active tab's controls, making certain buttons unhoverable.

Fixed by introducing parent panel tracking: each UIControl now has an
m_pParentPanel pointer, set automatically by the UI_MAP_ELEMENT macro
during mapElementsAndNames(). The hover code checks the control's parent
panel against the scene's GetMainPanel() and skips mismatches. This is the
same technique the Vita touch code used, but applied to mouse hover.

The coordinate conversion was also simplified. The old code had two separate
scaling paths (window dimensions for hover, display dimensions for sliders).
Now there is one conversion from window pixel coords to SWF coords using
the scene's own render dimensions.

REUSING VITA TOUCH APIs FOR MOUSE (ButtonList, UIScene)

Several APIs originally gated behind __PSVITA__ are now enabled for Win64:

- UIControl_ButtonList::SetTouchFocus(x,y) and CanTouchTrigger(x,y): the
  Flash-side ActionScript methods were already registered on all platforms
  in setupControl(), only the C++ wrappers were ifdef-gated. Opening the
  ifdefs to include _WINDOWS64 lets the mouse hover code delegate to Flash
  for list item highlighting, which handles internal scrolling and item
  layout that would be impractical to replicate in C++.

- UIScene::SetFocusToElement(id): programmatic focus-by-control-ID, used as
  a fallback when Iggy focusable objects do not match the C++ hit test.

- UIScene_LaunchMoreOptionsMenu::GetMainPanel(): returns the active tab
  panel control, needed by the hover code to filter inactive tab controls.

MOUSE CLICK DISPATCH (UIScene.cpp handleMouseClick)

Left-clicking previously relied entirely on Iggy ACTION_MENU_OK dispatch,
which routes to whatever Flash considers focused. This broke for custom-
drawn elements that are not Flash buttons (crafting recipe slots), and for
scenes where Iggy focus did not match what the user visually clicked.

Added a virtual handleMouseClick(x, y) on UIScene with a default
implementation that hit-tests C++ controls. When multiple controls report
overlapping bounds (common in debug scenes where TextInputs report full
Flash-width), it picks the one whose left edge X is closest to the click.
Returns true to consume the click and suppress the normal ACTION_MENU_A
dispatch via an m_mouseClickConsumedByScene flag on UIController.

The default implementation handles buttons, text inputs, and checkboxes
(toggling state and calling handleCheckboxToggled directly).

CRAFTING MENU MOUSE CLICK (UIScene_CraftingMenu.cpp)

The crafting menu recipe slots (H slots) are rendered through Iggy's custom
draw callback, not as Flash buttons. They have no focusable objects, so
mouse clicking did nothing.

The solution caches SWF-space positions during rendering: inside customDraw,
when H slot 0 and H slot 1 are drawn, the code extracts SWF coordinates
from the D3D11 transform matrix via gdraw_D3D11_CalculateCustomDraw_4J.
The X difference between slot 0 and slot 1 gives the uniform slot spacing.

handleMouseClick then uses these cached bounds to determine which recipe
slot was clicked, resets the vertical slot indices (same pattern as the
constructor), updates the highlight and vertical slots display, and re-shows
the old slot icon. This mirrors the existing controller LEFT/RIGHT
navigation in the base class handleKeyDown.
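
As a rough illustration of how a click can be resolved to a recipe slot from the two cached positions, consider the sketch below. The `HSlotCache` struct, its fields, and the assumption that slots are square are all hypothetical; only the idea of deriving the spacing from slot 0 and slot 1 comes from the text above.

```cpp
#include <cassert>

// Hypothetical cache of SWF-space values captured during customDraw.
struct HSlotCache {
    float slot0X;   // left edge of H slot 0
    float slot1X;   // left edge of H slot 1
    float y;        // top edge of the slot row
    float size;     // width and height of one slot (assumed square)
    int   count;    // number of H slots in the row
};

// Returns the clicked slot index, or -1 when the click misses every slot.
int hitTestHSlot(const HSlotCache& c, float x, float y) {
    const float spacing = c.slot1X - c.slot0X;   // uniform slot spacing
    assert(spacing > 0.0f);
    if (y < c.y || y >= c.y + c.size) return -1; // outside the row vertically
    if (x < c.slot0X) return -1;                 // left of the first slot
    const int index = static_cast<int>((x - c.slot0X) / spacing);
    if (index >= c.count) return -1;             // right of the last slot
    // Reject clicks that land in the gap between two slots.
    if (x - (c.slot0X + index * spacing) >= c.size) return -1;
    return index;
}
```

Because only two positions are cached, this approach depends on the slots being evenly spaced, which is what the X difference between slot 0 and slot 1 is taken to represent.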

DIRECT EDIT REFACTORING (UIControl_TextInput)

The direct text editing feature (type directly into text fields instead of
opening the virtual keyboard) was originally implemented inline in
CreateWorldMenu with all the state, character consumption, cursor tracking,
caret visibility, and cooldown logic hardcoded in one scene.

Moved everything into UIControl_TextInput:
- beginDirectEdit(charLimit): captures current label, inits cursor at end
- tickDirectEdit(): consumes chars, handles Backspace/Enter/Escape and the
  cursor keys (Left/Right/Home/End/Delete), enforces caret visibility every tick
  (because setLabel and Flash focus transitions continuously reset it),
  returns Confirmed/Cancelled/Continue
- cancelDirectEdit() / confirmDirectEdit(): programmatic control
- isDirectEditing() / getDirectEditCooldown() / getEditBuffer(): state query

For SWFs that lack the m_mcCaret MovieClip child (like AnvilMenu), the
existence check validates by reading a property from the resolved path,
since IggyValuePathMakeNameRef always succeeds even for undefined refs.
When no caret exists, the control inserts a _ character at the cursor
position as a visual fallback.

The caret check result is cached in m_bHasCaret/m_bCaretChecked to avoid
repeated Iggy calls that could corrupt internal state.

SCENES UPDATED WITH DIRECT EDIT + VIRTUAL KEYBOARD

Every scene with text input now supports both input modes: direct editing
when KBM is active, virtual keyboard (via NavigateToScene eUIScene_Keyboard)
when using a controller. The mode is chosen at press time based on
g_KBMInput.IsKBMActive().

- CreateWorldMenu: refactored to use the new UIControl_TextInput API,
  removing ~80 lines of inline editing code.

- AnvilMenu: item renaming now supports direct edit. The keyboard callback
  uses Win64_GetKeyboardText instead of InputManager.GetText (which reads
  from a different buffer on Win64). The virtual keyboard is opened with
  eUILayer_Fullscreen + eUIGroup_Fullscreen so it does not hide the anvil
  container menu underneath. Added null guards on getMovie() in setCostLabel
  and showCross since the AnvilMenu SWF may not fully load on Win64.

- SignEntryMenu: all 4 sign lines support direct edit. Clicking a different
  line while editing confirms the current one. Each line's cooldown timer
  is checked independently to prevent Enter from re-opening the edit.

- LaunchMoreOptionsMenu: seed field direct edit with proper input blocking.

- DebugCreateSchematic: all 7 text inputs (name + start/end XYZ coords).
  handleMouseClick is overridden to always consume clicks during edit to
  prevent Iggy re-entry on empty space.

- DebugSetCamera: all 5 inputs (camera XYZ + Y rotation + elevation).
  Clicking a different field while editing confirms the current value and
  opens the new one. Float display formatting changed from %f to %.2f.

All keyboard completion callbacks on Win64 now use Win64_GetKeyboardText
(two params: buffer + size) instead of InputManager.GetText. Unlike
InputManager.GetText, Win64_GetKeyboardText reads from the correct
g_Win64KeyboardResult global when using the in-game keyboard scene.

SCROLL WHEEL

Mouse wheel events (ACTION_MENU_OTHER_STICK_UP/DOWN) are now centrally
remapped to ACTION_MENU_UP/DOWN in UIController::handleKeyPress when KBM
is active. Previously each scene would need to handle OTHER_STICK actions
separately, and most did not, so scroll wheel only worked in a few places.

* Add mouse click support to CraftingMenu (tab switching, slot selection, craft)

The crafting screen's horizontal recipe slots and category tabs are custom-drawn
via Iggy callbacks rather than regular Flash buttons, so the standard mouse hover
system can't interact with them. This adds handleMouseClick to derive clickable
regions from the H slot positions cached during customDraw.

Tab clicking: tab hitboxes are computed relative to the H slot row since the
Vita TouchPanel overlays (full-screen invisible rectangles) aren't suitable
for direct hit-testing on Win64. The Y bounds were tuned empirically to match
the SWF tab icon positions. Clicking a tab runs the same switch logic as
LB/RB: hide old highlight, update group index, reset slot indices,
recalculate recipes, and refresh the display.

H slot clicking: clicking a different recipe slot selects it (updating V slots,
highlight, and re-showing the previous slot). Clicking the already-selected
slot crafts the item by dispatching ACTION_MENU_A through handleKeyDown,
reusing the existing crafting path. Empty slots (iCount == 0) are ignored.

All mouse clicks on the scene are consumed (return true) to prevent misses
from falling through as ACTION_MENU_A and accidentally triggering a craft.
This only suppresses mouse-originated A presses via m_mouseClickConsumedByScene;
keyboard and controller A remain fully functional.

Also enables GetMainPanel for Win64 (was Vita-only) so the mouse hover system
can filter controls by active panel, same as other tabbed menus.

* Fix mouse hover selecting wrong buttons from the third onward

The hover code was doing a redundant second hit-test against Iggy
focusable object bounds after the C++ control bounds had already
identified the correct control. Iggy focusable bounds are wider than
the actual visible buttons and overlap vertically, so the "pick
largest x0" heuristic would match focusables belonging to earlier
buttons when hovering the right side of buttons 3+.

Replaced the IggyPlayerGetFocusableObjects path with a direct
SetFocusToElement call using the already-correct hitControlId from
the C++ hit-test, same approach the click path uses in
handleMouseClick. Also switched the overlap tiebreaker from "largest
x0" to smallest area, consistent with how clicks resolve overlapping
controls. TextInput is excluded from hover focus to avoid showing
the caret on mere mouse-over (its Iggy focus is set on click).

* Use smallest-area tiebreaker for mouse click hit-testing too

Same overlap fix applied to handleMouseClick: when multiple controls
contain the click point, prefer the one with the smallest bounding
area instead of the one with the largest left-edge X. This is more
robust for any layout (vertical menus, grids, overlapping panels)
and matches the hover path logic.

Those changes were originally made to fix mouse handling in the teleport UI, but they broke every other UI that had been working correctly.
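
A minimal sketch of the smallest-area tiebreaker follows. The `ControlBounds` struct and `pickControl` function are hypothetical simplifications; the real hit-test works on UIControl objects and their panel offsets.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified view of a control's bounds in SWF space.
struct ControlBounds {
    int id;
    float x, y, w, h;
};

// Returns the id of the smallest control containing the point, or -1 if none does.
int pickControl(const std::vector<ControlBounds>& controls, float px, float py) {
    int bestId = -1;
    float bestArea = 0.0f;
    for (const ControlBounds& c : controls) {
        assert(c.w >= 0.0f && c.h >= 0.0f);
        const bool inside = px >= c.x && px < c.x + c.w &&
                            py >= c.y && py < c.y + c.h;
        if (!inside) continue;
        const float area = c.w * c.h;
        // Prefer the tightest bounds when several controls overlap the point.
        if (bestId == -1 || area < bestArea) {
            bestId = c.id;
            bestArea = area;
        }
    }
    return bestId;
}
```

Unlike a left-edge comparison, the area comparison does not depend on how the controls are arranged, so the same rule applies to vertical menus and grids.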

* Fix mouse cursor staying trapped in window on alt-tab

When the inventory or other UI with a hidden cursor was open,
alt-tabbing out would leave the cursor locked to the game window.
SetWindowFocused(false) from WM_KILLFOCUS correctly released the
clip and showed the cursor, but Tick() was unconditionally calling
SetCursorPos every frame to re-center it, overriding the release.

Added m_windowFocused to the Tick() condition so cursor manipulation
only happens while the window actually has focus.

* Map mouse right click to ACTION_MENU_X for inventory half-stack

Right clicking an item stack in Java Edition picks up half of it.
Console Edition already handles this via ACTION_MENU_X (the X button
on controller), which sets buttonNum=1 in handleKeyDown. This maps
mouse right click to that same action so KBM players get the same
behavior across all container menus (inventory, chests, furnaces,
hoppers, etc).

* Fix mouse hover hitting removed controls (ghost hitboxes)

When removeControl() removes a Flash element (e.g. the Reinstall
button in Help & Options, or the Debug button when disabled), the
C++ control object stays in the m_controls vector. On Vita this was
handled by calling setHidden(true) and checking getHidden() in the
touch hit-test, but on Windows64 none of that was happening.

The result: removed buttons kept phantom bounds that the hover code
would match against, stealing focus from the buttons that shifted
into their visual position. In the Help & Options menu with debug
enabled, the removed Reinstall button (Button6) had ghost bounds
overlapping where the Debug button (Button7) moved to after the
removal, making Debug un-hoverable and snapping focus to Button1.

The fix has three parts:

- removeControl() now calls setHidden(true) on all platforms, not
  just Vita. The m_bHidden member was already declared on all
  platforms, only the accessors were ifdef'd behind __PSVITA__.

- Removed the __PSVITA__ ifdef from setHidden/getHidden in
  UIControl.h so they're available everywhere.

- Added getHidden() checks in both the hover and click hit-test
  loops, matching what the Vita touch code already does. The check
  is a simple bool read (no Flash/Iggy call), placed before the
  getVisible() query which hits Flash and can return stale values
  for removed elements.

* Add right-click to open save options in world selection menu

On controller, RB (ACTION_MENU_RIGHT_SCROLL) opens the save options
dialog (rename/delete) when a save is selected. Mouse right-click
maps to ACTION_MENU_X, which had no Windows64 handler in this scene.

Added save options handling under ACTION_MENU_X for _WINDOWS64 so
right-clicking a save opens the same dialog. Also handles the mashup
world hide action for right-click consistency. Console-only options
(copy save, save transfer) are excluded since they don't apply here.

* Fix splitscreen mouse, keyboard cursor, and local player join

Mouse hover and click in split-screen was broken: the coordinate
conversion from window pixels to Flash/SWF space did not account for
the viewport tile-origin offset or the smaller display dimensions of
each splitscreen quadrant. Now the mouse position is mapped through
three steps: window pixels to UIController screen space, subtract the
viewport origin (which varies per quadrant/split type), then scale
from display size to SWF authoring size. This fixes hover highlighting
and click targeting in all splitscreen layouts.
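
The three mapping steps can be sketched as follows. The `ViewportMapping` struct and all of its field names are assumptions made for illustration; the actual code reads these values from UIController and the scene.

```cpp
#include <cassert>

struct Vec2 { float x, y; };

// Hypothetical bundle of the values the conversion needs.
struct ViewportMapping {
    float windowW, windowH;    // client area of the OS window, in pixels
    float screenW, screenH;    // UIController screen space
    float originX, originY;    // viewport origin for this quadrant/split type
    float displayW, displayH;  // on-screen size of this player's viewport
    float swfW, swfH;          // SWF authoring size
};

Vec2 windowToSwf(const ViewportMapping& m, Vec2 p) {
    assert(m.windowW > 0 && m.windowH > 0);
    assert(m.displayW > 0 && m.displayH > 0);
    // Step 1: window pixels to UIController screen space.
    float sx = p.x * (m.screenW / m.windowW);
    float sy = p.y * (m.screenH / m.windowH);
    // Step 2: subtract the viewport origin.
    sx -= m.originX;
    sy -= m.originY;
    // Step 3: scale from display size to SWF authoring size.
    return { sx * (m.swfW / m.displayW), sy * (m.swfH / m.displayH) };
}
```

A point outside the player's viewport maps to coordinates below zero or beyond the SWF size, which a caller could use to reject the hover or click.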

Mouse input was also bleeding into other splitscreen players' UI groups
because the scene lookup iterated all groups. Now it only checks the
fullscreen group and the primary (KBM) player's group, so controller
players' menus are never affected by mouse movement.

Mouse grab/release (cursor lock for gameplay) was triggering for every
local player's tick, causing fights between splitscreen players over
the cursor state. Now only the primary pad player controls grab state.

The in-game keyboard scene in PC mode had no cursor movement: typing
always appended at the end and backspace always deleted from the end.
Added a cursor position tracker (m_iCursorPos) so that characters are
inserted at the cursor, backspace deletes behind it, and arrow keys,
Home, End, and Delete all work as expected. The Flash caret is synced
to the cursor position each tick. Also stopped syncing the text buffer
back from Flash in PC mode, which was resetting the cursor every tick.
Arrow keys in PC mode no longer get forwarded to Flash (which would
move the on-screen keyboard selector instead of the text cursor).

AddLocalPlayerByUserIndex was calling NotifyPlayerJoined before the
IQNet slot was actually registered, passing a pointer obtained via
GetLocalPlayerByUserIndex which checks customData (not set yet at that
point). Now AddLocalPlayerByUserIndex is called first, and if it
succeeds, the notification uses the static m_player array directly.
The stub AddLocalPlayerByUserIndex now properly initialises the slot
with gamertag and remote/host flags instead of being a no-op.

IsSignedIn was hardcoded to return true only for pad 0, preventing
splitscreen players from joining. Now it checks IsPadConnected so any
connected controller can sign in.

GetXUID returned INVALID_XUID for all pads except 0, which broke
splitscreen player identity. Now each pad gets a unique XUID derived
from the base value plus the pad index.

Pinned internal resolution to 1920x1080 and removed GetSystemMetrics
auto-detection which was picking up the native monitor resolution and
breaking the 16:9 assumption in the viewport math and Flash layout.
DPI awareness is kept for consistent pixel coordinates.

* Fix Escape key not opening pause menu during tutorial hints

The KBM pause check had an IsTutorialVisible guard that blocked
Escape entirely while any tutorial popup was on screen. The
controller path never had this restriction. Removed the check
so Escape behaves the same as Start on controller.

* Fix crash in WriteHeader when save buffer is too small for header table

When a player enters a new region, RegionFile's constructor calls
createFile which adds a FileEntry with length 0 to the file table.
This increases the header table size (appended at the end of the save
buffer) by sizeof(FileEntrySaveData) per entry, but since no actual
data is written to the file, MoveDataBeyond is never called and the
committed virtual memory pages are never grown to match.

On the next autosave tick, saveLevelData writes level.dat first
(before chunkSource->save which would have grown the buffer). If
level.dat doesn't need to grow, finalizeWrite calls WriteHeader which
tries to memcpy the now-larger header table past the end of committed
memory, causing an access violation.

This is especially likely in splitscreen, where two players exploring
at the same time can create multiple new RegionFile entries within a
single tick, quickly exhausting the page-alignment slack in the buffer.
(Yes, I am working on splitscreen in the meantime :) )

The fix was deduced by tracing the crash callstack through the save
system: FileHeader, ConsoleSaveFileOriginal, the stream chain, and
the RegionFile/RegionFileCache layer. The root cause turned out to be
a gap between createFile (which grows the header table) and
MoveDataBeyond (the only place that grows the buffer), with
finalizeWrite sitting right in between unprotected.

The buffer growth check added here mirrors the exact same VirtualAlloc
pattern already used in MoveDataBeyond (lines 484-497) and in the
constructor's decompression path (lines 176-190), so it integrates
naturally with the existing code. Same types, same page rounding,
same error handling. In the common case (no new entries, buffer already
big enough) the check is a single DWORD comparison whose growth branch
is not taken, so the overhead is negligible.

This is the right place for the fix because finalizeWrite is the sole
caller of WriteHeader, meaning every code path that writes the header
(closeHandle, PrepareForWrite, deleteFile, Flush) is now protected by
a single check point.
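
The shape of the check can be illustrated with a portable sketch. It models the committed size as a plain number instead of calling VirtualAlloc; the `SaveBuffer` struct, the 4096-byte page size, and the function names are assumptions for the example and are not taken from the save system.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kPageSize = 4096;   // assumed page size for this sketch

// Round n up to the next multiple of the page size.
constexpr std::size_t roundUpToPage(std::size_t n) {
    return (n + kPageSize - 1) / kPageSize * kPageSize;
}

// Hypothetical model of the save buffer: file data followed by the header table.
struct SaveBuffer {
    std::size_t committedBytes;   // bytes currently backed by committed pages
    std::size_t dataBytes;        // file data stored ahead of the header table
};

// Makes sure data plus header table fit in committed memory before the header
// is written. Returns how many additional bytes had to be committed.
std::size_t ensureRoomForHeader(SaveBuffer& buf, std::size_t entryCount,
                                std::size_t entrySize) {
    assert(entrySize > 0);
    const std::size_t needed = buf.dataBytes + entryCount * entrySize;
    if (needed <= buf.committedBytes) return 0;   // common case: one comparison
    const std::size_t newSize = roundUpToPage(needed);
    const std::size_t grow = newSize - buf.committedBytes;
    // Real code would commit `grow` more bytes here (VirtualAlloc with
    // MEM_COMMIT on Windows) and handle failure before updating the size.
    buf.committedBytes = newSize;
    return grow;
}
```

Running the check immediately before the header write means zero-length entries added by createFile are accounted for even though they never pass through MoveDataBeyond.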

* Fix TextInput bugs and refactor direct edit handling into UIScene base class

The fake cursor character (_) used for SWFs without m_mcCaret was leaking
into saved sign and anvil text. This happened because setLabel() with
instant=false only updates the C++-side cache, deferring the Flash write
to the next control tick. Any getLabel() call before that tick reads the
old Flash value still containing the underscore. Fixed by passing
instant=true in confirmDirectEdit, cancelDirectEdit, and the Enter key
path inside tickDirectEdit, so the cleaned text hits Flash immediately.

Mouse hover over TextInput controls (world name, anvil name, seed field)
was not showing the yellow highlight border. The hover code used
IggyPlayerSetFocusRS which sets Iggy's internal dispatch focus but does
not trigger Flash's ChangeState callback, so no visual feedback appeared.
Buttons worked fine because Iggy draws its own focus ring on them, but
TextInput relies entirely on ChangeState(0) for the yellow border.
Switched to SetFocusToElement which goes through the Flash-side SetFocus
path, then immediately call setCaretVisible(false) to suppress the
blinking caret that comes with focus. No visual flicker since rendering
happens after both tickInput and scene tick complete.

While direct editing, mouse hover was able to move focus away to other
TextInputs on the same scene (most noticeably on the sign editor, where
hovering a different line would steal focus from the line being typed).
Added an isDirectEditBlocking() check in the hover path to skip focus
changes when any input on the scene is actively being edited.

The Done button in SignEntryMenu was unresponsive to mouse clicks during
direct editing. The root cause is execution order: handleMouseClick runs
before handleInput in the frame. The base handleMouseClick found the Done
button and called handlePress, but handlePress bailed out because of the
isDirectEditing guard. The click was marked consumed, so handleInput
never saw it. Fixed by overriding handleMouseClick in SignEntryMenu to
detect the Done button hit while editing and confirm + close directly.

Added click-outside-to-deselect for anvil and world name text inputs.
Both scenes previously required Enter to confirm the edit, which felt
wrong. Now clicking anywhere outside the text field bounds confirms the
current text, matching standard UI behavior.

The anvil menu now updates the item name in real time while typing, like
Java edition. Previously the name was only applied on Enter, so the
repair cost display was stale until confirmation.

The biggest change is structural: every scene that used direct editing
(AnvilMenu, CreateWorldMenu, SignEntryMenu, LaunchMoreOptionsMenu,
DebugCreateSchematic, DebugSetCamera) had its own copy of the same
boilerplate -- tickDirectEdit loops in tick(), click-outside hit testing
in handleMouseClick(), cooldown guard checks in handleInput/handlePress,
and result dispatch with switch/if chains. This was around 200 lines of
near-identical code scattered across 6 files, each with its own slight
variations and its own bugs waiting to happen.

Pulled all of it into UIScene with two virtual methods: getDirectEditInputs()
where scenes register their text inputs, and onDirectEditFinished() where
they handle confirmed/cancelled results. The base class tick() drives
tickDirectEdit on all registered inputs, handleMouseClick() does the
click-outside-to-deselect hit test generically using panel offsets, and
isDirectEditBlocking() replaces all the inline cooldown checks. Scenes
now just override those two methods and get everything for free.
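
The division of labour between the base class and a scene might look roughly like this. Every type and signature here is a hypothetical stand-in (including the parameters of getDirectEditInputs and onDirectEditFinished, which the text above does not spell out), and the cooldown handling is left out for brevity.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class EditResult { Continue, Confirmed, Cancelled };

// Hypothetical stand-in for UIControl_TextInput; only what the loop needs.
struct TextInputStub {
    bool editing = false;
    EditResult pending = EditResult::Continue;   // result the next tick reports
    std::wstring buffer;

    EditResult tickDirectEdit() {
        if (!editing) return EditResult::Continue;
        const EditResult r = pending;
        if (r != EditResult::Continue) {
            editing = false;
            pending = EditResult::Continue;
        }
        return r;
    }
};

class SceneBase {
public:
    virtual ~SceneBase() = default;

    // Drives every registered input once per frame and reports finished edits.
    void tick() {
        std::vector<TextInputStub*> inputs;
        getDirectEditInputs(inputs);
        for (TextInputStub* input : inputs) {
            assert(input != nullptr);
            const EditResult r = input->tickDirectEdit();
            if (r != EditResult::Continue) onDirectEditFinished(*input, r);
        }
    }

    // True while any registered input is being edited.
    bool isDirectEditBlocking() {
        std::vector<TextInputStub*> inputs;
        getDirectEditInputs(inputs);
        for (const TextInputStub* input : inputs)
            if (input->editing) return true;
        return false;
    }

protected:
    // Scenes override these two; the defaults register nothing and do nothing.
    virtual void getDirectEditInputs(std::vector<TextInputStub*>& out) { (void)out; }
    virtual void onDirectEditFinished(TextInputStub& input, EditResult result) {
        (void)input;
        (void)result;
    }
};

// Example scene with four lines, loosely modelled on the sign editor.
class SignSceneSketch : public SceneBase {
public:
    TextInputStub lines[4];
    int finishedCount = 0;
    EditResult lastResult = EditResult::Continue;

protected:
    void getDirectEditInputs(std::vector<TextInputStub*>& out) override {
        for (TextInputStub& line : lines) out.push_back(&line);
    }
    void onDirectEditFinished(TextInputStub&, EditResult result) override {
        ++finishedCount;
        lastResult = result;
    }
};
```

With this shape a scene contains no editing loop of its own, which is the point of the refactor: the per-scene copies of that loop are where the variations and bugs lived.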

Also removed the m_activeDirectEditControl enum tracking from the debug
scenes (DebugCreateSchematic, DebugSetCamera) since the base class
handles lifecycle tracking through the controls themselves.

* Remap scroll wheel to LEFT/RIGHT for horizontal controls

The scroll wheel was always remapped to UP/DOWN, which is fine for
vertical lists but useless on horizontal controls like sliders and
the texture pack selector.

Track whether the mouse is hovering a horizontal control during the
hover hit-test (new bool m_bMouseHoverHorizontalList, set for
eTexturePackList and eSlider). When the flag is set, handleKeyPress
emits LEFT/RIGHT instead of UP/DOWN for wheel events.
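
The remapping decision can be sketched as below. The `MenuAction` enum is a hypothetical stand-in for the real ACTION_MENU_* constants, and pairing wheel-up with LEFT and wheel-down with RIGHT is an assumption, since the text above does not state which direction maps to which.

```cpp
#include <cassert>

// Hypothetical stand-in for the ACTION_MENU_* constants.
enum class MenuAction { Up, Down, Left, Right, OtherStickUp, OtherStickDown, Other };

// Remaps wheel events (which arrive as OtherStick actions) for KBM users.
MenuAction remapWheel(MenuAction in, bool kbmActive, bool hoverHorizontal) {
    if (!kbmActive) return in;   // controller input passes through unchanged
    switch (in) {
    case MenuAction::OtherStickUp:
        return hoverHorizontal ? MenuAction::Left : MenuAction::Up;
    case MenuAction::OtherStickDown:
        return hoverHorizontal ? MenuAction::Right : MenuAction::Down;
    default:
        return in;               // non-wheel actions are not touched
    }
}

// Small self-check covering the three cases.
void remapWheelExample() {
    assert(remapWheel(MenuAction::OtherStickUp, true, false) == MenuAction::Up);
    assert(remapWheel(MenuAction::OtherStickUp, true, true) == MenuAction::Left);
    assert(remapWheel(MenuAction::OtherStickUp, false, true) == MenuAction::OtherStickUp);
}
```

Here `hoverHorizontal` plays the role of m_bMouseHoverHorizontalList, which the hover hit-test would update each frame.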

TexturePackList is also now part of the mouse hover system with
proper hit-testing, relative-coord SetTouchFocus and GetRealHeight
for accurate bounds.

* Guard setCaretVisible and setCaretIndex against null movie

tickDirectEdit calls into Iggy every tick without checking if the
movie is still valid, which crashes inside iggy_w64.dll when the
Flash movie gets unloaded or isn't ready yet.

* Fix creative inventory scroll for both mouse wheel and controller

The mouse scroll wheel was not working in the creative inventory at
all. UIController remaps wheel input from OTHER_STICK to UP/DOWN for
KBM users, but the base container menu handler consumed UP/DOWN for
grid navigation before it could reach the creative menu's page
scrolling logic in handleAdditionalKeyPress. Fixed by detecting
scroll wheel input on UP/DOWN in the base handler and forwarding it
as OTHER_STICK to handleAdditionalKeyPress instead.

Also fixed the controller right stick scrolling way too fast: it was
jumping TabSpec::rows (5) rows per tick at 100ms repeat rate, which
blew through the entire item list almost instantly. Reduced to 1 row
per tick so scrolling feels controlled on both input methods.

* Fix split-screen world rendering aspect ratio

gluPerspective was hardcoded to use g_iAspectRatio (always 16:9)
instead of the aspect parameter from getFovAndAspect, which adjusts
for split-screen viewports. The 3D world was horizontally stretched
in top/bottom split because the projection used 16:9 while the
viewport was 32:9.
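
The arithmetic behind the 32:9 figure is shown in the short sketch below. The `Split` enum and `viewportAspect` function are illustrative only; in the game the value comes from getFovAndAspect.

```cpp
#include <cassert>

enum class Split { None, TopBottom, LeftRight };   // illustrative split types

// Aspect ratio of one player's viewport for a given screen size and split.
float viewportAspect(int screenW, int screenH, Split split) {
    assert(screenW > 0 && screenH > 0);
    int w = screenW;
    int h = screenH;
    if (split == Split::TopBottom) h /= 2;   // each player gets half the height
    if (split == Split::LeftRight) w /= 2;   // each player gets half the width
    return static_cast<float>(w) / static_cast<float>(h);
}
```

On a 1920x1080 screen a top/bottom split gives each player 1920x540, which is 32:9. Projecting with 16:9 into that viewport is what stretched the world horizontally.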

* Split-screen UI system with full ultrawide and multi-aspect-ratio support

Screen resolution is now auto-detected from the monitor at startup
instead of being hardcoded to 1920x1080. This fixes rendering on
ultrawide (21:9), super-ultrawide (32:9), 16:10, and any other
aspect ratio -- both in singleplayer and split-screen multiplayer.

The 3D world renders at native resolution so the full monitor is used.
Flash UI is 16:9-fitted and centered inside each viewport, pillarboxed
on wide displays and letterboxed on tall ones. Logical game dimensions
(used for ortho projection and HUD layout) are computed proportionally
from the real screen aspect ratio, fixing the stretched world projection
and HUD that the old hardcoded 1280x720 caused on non-16:9 monitors.

GameRenderer::ComputeViewportForPlayer uses the actual backbuffer size
instead of the logical game size, which was causing split-screen
viewports to be sized incorrectly.

UIScene::render fits menus to 16:9 within each split viewport using
GetViewportRect + Fit16x9, keeping inventory/crafting/options screens
at their designed aspect ratio instead of stretching.

Panorama and MenuBackground render at full viewport size with proper
tile scaling so the background fills the entire area without gaps in
vertical split and quadrant layouts.

HUD tile rendering uses ComputeTileScale to uniformly scale the SWF
and show the bottom portion (hotbar, hearts, hunger) in horizontal
and quadrant splits. repositionHud passes visible SWF-space dimensions
to ActionScript for proper element centering within each viewport.

Chat and Tooltips overlays use ComputeTileScale and
ComputeSplitContentOffset to anchor correctly to the bottom of each
player's viewport tile.

Container menus apply Fit16x9 to pointer coordinate mapping so the
cursor tracks correctly in split-screen. getMouseToSWFScale moved out
of the header into the .cpp. Mouse input in onMouseTick is gated to
pad 0 since raw mouse deltas should only drive player 1.

All shared viewport math lives in UISplitScreenHelpers.h:
- GetViewportRect: origin and dimensions for any viewport type
- Fit16x9: aspect-correct fitting with centering offsets
- ComputeTileScale: uniform scale and Y-offset for tile rendering
- ComputeSplitContentOffset: content centering for overlay components
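
As an illustration of aspect-correct fitting with centering offsets, a helper along these lines would produce the pillarbox and letterbox behaviour described above. This is a sketch under assumptions, not the contents of UISplitScreenHelpers.h: the `Rect` type and the `fitSixteenByNine` name are hypothetical.

```cpp
#include <cassert>

struct Rect { float x, y, w, h; };   // hypothetical viewport rectangle

// Returns the largest 16:9 rectangle centered inside the viewport.
Rect fitSixteenByNine(const Rect& vp) {
    assert(vp.w > 0.0f && vp.h > 0.0f);
    const float target = 16.0f / 9.0f;
    Rect out = vp;
    if (vp.w / vp.h > target) {
        // Viewport is wider than 16:9: keep the height, pillarbox the sides.
        out.w = vp.h * target;
        out.x = vp.x + (vp.w - out.w) * 0.5f;
    } else {
        // Viewport is taller than 16:9: keep the width, letterbox top and bottom.
        out.h = vp.w / target;
        out.y = vp.y + (vp.h - out.h) * 0.5f;
    }
    return out;
}
```

For a 3440x1440 viewport this yields a 2560x1440 area offset 440 pixels from the left. The same offsets have to be applied when mapping pointer coordinates, otherwise the cursor drifts from the menu it is drawn over.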

* Fix XUID assignment for split-screen local players

Main's XUID refactor returned INVALID_XUID for pad != 0, which breaks
split-screen because each local player needs a distinct identity for
the save system and per-player inventory data.

Now pads 1-3 get unique XUIDs derived from the legacy embedded base
(base + iPad), same as the original console behavior. Only pad 0
uses the persistent uid.dat-backed XUID for networking.

* Use persistent XUID for all pads in GetXUID

All pads now get unique XUIDs derived from the persistent uid.dat value
(base + iPad offset). This gives each split-screen player a globally
unique identity that works for both local play and online multiplayer.

The host legacy XUID override for save compatibility still happens in
Minecraft.cpp after GetXUID is called, so old worlds are unaffected.

* Split-screen networking, window resize, bitmap font fix, and multiplayer stability

Adds the networking layer for non-host split-screen multiplayer, implements
live window resize with swap chain recreation, fixes bitmap font scaling at
small window sizes, and fixes several crash-causing bugs in the multiplayer
stack (compression buffer overflow, TCP stream desync, chunk visibility race,
CompressedTileStorage torn reads, reconnect stability).

== Non-host split-screen multiplayer ==

Each split-screen pad on a non-host client opens its own TCP connection to
the host. From the host's perspective each connection looks like a normal
remote player (gets its own smallId, Socket, PlayerConnection).

WinsockNetLayer: JoinSplitScreen(), CloseSplitScreenConnection(),
SplitScreenRecvThreadProc, per-pad socket/thread/smallId tracking
(s_splitScreenSocket[], s_splitScreenSmallId[], s_splitScreenRecvThread[]).
GetLocalSocket() returns the correct TCP socket for a given local sender's
smallId. GetSplitScreenSmallId() returns the host-assigned smallId for a pad.

GameNetworkManager::CreateSocket: non-host path (localPlayer && !IsHost() &&
IsInGameplay()) calls JoinSplitScreen, sets the IQNet slot's smallId and
resolvedXuid, creates a non-hostLocal Socket + ClientConnection, sends
PreLoginPacket, registers via addPendingLocalConnection.

PlatformNetworkManagerStub::RemoveLocalPlayerByUserIndex: implemented the
formerly-empty stub. Calls NotifyPlayerLeaving, CloseSplitScreenConnection,
and clears the IQNet slot fields so the pad can rejoin cleanly.

SmallId pool: s_nextSmallId starts at XUSER_MAX_COUNT (4), reserving
m_player[0-3] for local pads so remote players never collide.

IQNetPlayer::SendData: non-host local senders now route through
GetLocalSocket(m_smallId) instead of always using SendToSmallId.
IQNet::GetLocalPlayerByUserIndex: rewritten. Pad 0 on non-host uses
GetLocalSmallId() for direct lookup; pads 1-3 check m_player[padIdx].
C_4JProfile::IsSignedIn: pad 0 always returns true (was checking controller
connection, which is unreliable on Win64).

GetGamertag/GetDisplayName: for pads 1-3 with active local players, returns
the pad-specific gamertag from IQNet::m_player instead of always returning
the primary username.

ClientConnection: isPrimaryConnection() (true on host or for the primary pad
on non-host) guards relative-delta and world-modifying handlers to prevent
double-processing of shared state:
- Guarded: handleMoveEntity, handleMoveEntitySmall, handleChunkTilesUpdate,
  handleBlockRegionUpdate, handleTileUpdate, handleTakeItemEntity,
  handleSignUpdate, handleTileEntityData, handleTileEvent,
  handleTileDestruction, handleComplexItemData, handleLevelEvent,
  handleSoundEvent, handleParticleEvent, handleAddGlobalEntity.
- handleSetEntityMotion: secondary connections only accept motion targeting
  their own local player (knockback).
- handleExplosion: world modification (finalizeExplosion) guarded,
  per-player knockback unguarded. Added null check on localplayers[].
- Entity spawn/remove/teleport/data handlers left unguarded: putEntity is
  idempotent and the rest are absolute-value setters, so processing them
  twice is harmless.

handleLogin: added else clause to set level when the dimension already exists
(was leaving level NULL on reconnect).
handleChunkVisibilityArea/handleChunkVisibility: added null check on level.
handleContainerOpen: added null check on localplayers[m_userIndex].

== Reconnect stability ==

PendingConnection: duplicate XUID no longer rejects with eDisconnect_Banned.
Instead it force-disconnects the stale old connection via
stalePlayer->connection->disconnect(), queues the old smallId for recycling
via queueSmallIdForRecycle(), then calls handleAcceptedLogin for the new
connection.

MinecraftServer: swapped tick order so players->tick() (disconnect queue)
runs before connection->tick() (new logins). The old player is removed
from PlayerList before the new LoginPacket's XUID check runs.

PlayerList: PushFreeSmallId and ClearSocketForSmallId moved here from
DoWork, called only after PlayerConnection::disconnect() completes and
the read/write threads are dead. New queueSmallIdForRecycle() method lets
PendingConnection push smallIds into m_smallIdsToClose, which PlayerList::tick()
processes through closePlayerConnectionBySmallId() for deferred cleanup.
Prevents a race where the old write thread could resolve getPlayer() to a
recycled smallId's new connection and send stale packets on it.

SocketInputStreamLocal::close() and SocketOutputStreamLocal::close() now
actually clear their queues (std::swap with empty queue instead of calling
.empty() which is a read-only no-op).
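
The swap idiom mentioned above looks roughly like this. ClearQueue is a hypothetical helper name for illustration, not the name used in the stream classes.

```cpp
#include <cassert>
#include <queue>
#include <utility>

// Drop every pending element. std::queue has no clear(), and empty() only
// reports whether the queue is empty, so swapping with a fresh queue is a
// common way to discard the contents.
template <typename T>
void ClearQueue(std::queue<T> &q)
{
    std::queue<T> emptyQueue;
    std::swap(q, emptyQueue);
    // emptyQueue now holds the old elements and frees them when it goes out of scope.
}
```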

ServerConnection::stop(): pending and players vectors are snapshot-copied
before iterating (prevents iterator invalidation). Remote players receive
a DisconnectPacket via disconnect(eDisconnect_Quitting) instead of raw
close(). tick(): added else clause so flush() only runs on live connections.

WinsockNetLayer::Shutdown(): accept thread stopped first (prevents new recv
threads from spawning), then all recv threads are collected and waited on,
then connections are closed and split-screen sockets cleaned up. Clears
disconnect and free-pool vectors before deleting critical sections.

WinsockNetLayer::JoinGame(): waits for old s_clientRecvThread to fully
exit before creating a new TCP connection. Prevents the old recv thread
from reading bytes off the new socket and desynchronizing the stream.

== Compression buffer overflow ==

CompressLZXRLE and CompressRLE wrote RLE intermediate output into a fixed
100KB buffer with no bounds checking. Full chunk columns are ~160KB and
the RLE step can expand 0xFF bytes to 2 bytes each, easily overflowing
into rleDecompressBuf and heap metadata. This caused delayed crashes in
unrelated code (Packet::readPacket, LevelRenderer::updateDirtyChunks) after
the first autosave, since that's when full chunks get compressed.

Fix: dynamic allocation when worst-case RLE output (SrcSize * 2) exceeds
the static buffer. Static buffer still used for small inputs (zero overhead).
CompressRLE: moved LeaveCriticalSection after dynamic buffer cleanup.
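
A minimal sketch of the buffer selection described in the fix, under stated assumptions: RleScratch, AcquireRleScratch and g_staticRleBuf are hypothetical names, and the locking that protects the shared static buffer in the real code is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

constexpr std::size_t kStaticRleBufSize = 100 * 1024; // the 100KB figure quoted above
static unsigned char g_staticRleBuf[kStaticRleBufSize];

// Scratch space for RLE output. 'owned' is non-null only for the heap fallback,
// so the allocation is released automatically when the struct is destroyed.
struct RleScratch {
    unsigned char *ptr = nullptr;
    std::size_t size = 0;
    std::unique_ptr<unsigned char[]> owned;
};

inline RleScratch AcquireRleScratch(std::size_t srcSize)
{
    const std::size_t worstCase = srcSize * 2; // every input byte may expand to 2 bytes
    RleScratch s;
    if (worstCase <= kStaticRleBufSize) {
        // Small input: reuse the static buffer, no allocation.
        s.ptr = g_staticRleBuf;
        s.size = kStaticRleBufSize;
    } else {
        // Large input (e.g. a full chunk column): allocate the worst case.
        s.owned.reset(new unsigned char[worstCase]);
        s.ptr = s.owned.get();
        s.size = worstCase;
    }
    return s;
}
```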

DecompressLZXRLE: now checks zlib return value (was completely ignored).
On failure, bails out immediately with *pDestSize = 0. Added RLE input
bounds checking (pucIn >= pucEnd before reading count/data bytes) and
output bounds checking (pucOut + count > pucOutEnd). Same bounds checks
applied to DecompressRLE.
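
The input and output bounds checks can be sketched as follows. The byte format in the comment is an assumption made for illustration (this message only says that 0xFF bytes can expand to 2 bytes), and DecodeRleBounded is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed format (hypothetical): 0xFF is an escape byte.
//   0xFF, n (n < 3)      -> n + 1 literal 0xFF bytes
//   0xFF, n (n >= 3), v  -> n + 1 copies of v
//   any other byte       -> itself
// Returns false on truncated input or output overflow; *outSize is 0 in that case.
inline bool DecodeRleBounded(const std::uint8_t *in, std::size_t inSize,
                             std::uint8_t *out, std::size_t outCap,
                             std::size_t *outSize)
{
    const std::uint8_t *inEnd = in + inSize;
    std::uint8_t *outPos = out;
    std::uint8_t *outEnd = out + outCap;
    *outSize = 0;
    while (in < inEnd) {
        std::uint8_t value = *in++;
        std::size_t count = 1;
        if (value == 0xFF) {
            if (in >= inEnd) return false;       // count byte missing
            const std::uint8_t n = *in++;
            count = static_cast<std::size_t>(n) + 1;
            if (n >= 3) {
                if (in >= inEnd) return false;   // data byte missing
                value = *in++;
            }
        }
        // Output check: refuse to write past the end of the destination.
        if (count > static_cast<std::size_t>(outEnd - outPos)) return false;
        for (std::size_t i = 0; i < count; ++i) *outPos++ = value;
    }
    *outSize = static_cast<std::size_t>(outPos - out);
    return true;
}
```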

== Stream desync (Connection write thread) ==

The write thread had two output paths to the same TCP socket: bufferedDos
(5KB buffered stream) and direct sos->writeWithFlags(). Chunk data sent
via queueSend() used the direct path with shouldDelay=true, while other
packets used bufferedDos. If bufferedDos had unflushed bytes, the direct
write arrived at the client first, reordering the TCP stream and producing
bad packet ID crashes.

Fix: flush bufferedDos immediately before every direct sos->writeWithFlags().
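
A toy model of the ordering rule, not the Connection code itself: FakeSocket and BufferedWriter are hypothetical stand-ins for the TCP socket and bufferedDos.

```cpp
#include <cassert>
#include <string>

// Stand-in for the TCP socket: records bytes in the order they hit the wire.
struct FakeSocket {
    std::string wire;
    void write(const std::string &bytes) { wire += bytes; }
};

// Stand-in for the buffered output stream: holds bytes until flushed.
struct BufferedWriter {
    FakeSocket &sock;
    std::string pending;
    explicit BufferedWriter(FakeSocket &s) : sock(s) {}
    void write(const std::string &bytes) { pending += bytes; }
    void flush()
    {
        sock.write(pending);
        pending.clear();
    }
};

// Direct path: flush buffered bytes first so the wire order matches the
// order in which packets were queued.
inline void WriteDirectOrdered(BufferedWriter &buffered, FakeSocket &sock,
                               const std::string &payload)
{
    buffered.flush();
    sock.write(payload);
}
```

Without the flush() call, a packet still sitting in the buffer would reach the wire after the direct write, which is the reordering described above.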

== Chunk visibility race (empty first chunk after 30s) ==

BlockRegionUpdatePacket (direct socket write via queueSend) could arrive
at the client before ChunkVisibilityAreaPacket (buffered). The client
called getChunk() on a chunk that didn't exist yet in the cache, got
EmptyLevelChunk (whose setBlocksAndData is a no-op), and silently lost
the block data. On superflat this left one invisible chunk; on normal
worlds it crashed the renderer.

Fix: handleBlockRegionUpdate calls dimensionLevel->setChunkVisible() for
full-chunk BRUPs before writing data, making it independent of packet
ordering. Added post-write verification logging.

CompressedTileStorage race: get() reads indicesAndData twice without a
lock. compress() can swap the pointer between reads, producing indices
from the old buffer paired with data from the new buffer. Fix: snapshot
indicesAndData into a local variable before deriving both pointers. Same
snapshot pattern applied to getData() (non-Vita path), isRenderChunkEmpty(),
getHighestNonEmptyY(), getAllocatedSize(), and write(). All methods now
also guard against NULL snapshots.
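
The snapshot pattern can be sketched like this. TileStorageSketch, its 1024-byte index table and the use of std::atomic are assumptions for illustration; the real class layout is not shown here, and the sketch does not address the lifetime of a buffer that compress() has replaced.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct TileStorageSketch {
    // Hypothetical layout: a fixed-size index table followed by the data.
    static constexpr int kIndexBytes = 1024;
    std::atomic<std::uint8_t *> indicesAndData{nullptr};

    // Read the shared pointer once, then derive both pointers from that one
    // snapshot so they always refer to the same buffer.
    bool getPointers(const std::uint8_t **indices, const std::uint8_t **data) const
    {
        std::uint8_t *snapshot = indicesAndData.load(std::memory_order_acquire);
        if (!snapshot) return false; // guard against a NULL snapshot
        *indices = snapshot;
        *data = snapshot + kIndexBytes;
        return true;
    }
};
```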

== Window resize ==

ResizeD3D() destroys the old swap chain, creates a new one at the target
size, then patches InternalRenderManager members directly via memory
offsets (0x20=swap chain, 0x28=RTV, 0x50=SRV, 0x98=DSV, 0x5138/0x513C=
backbuffer width/height). Offset verification cross-checks known pointers
(device at 0x10, swap chain at 0x20) before patching. Old RTV/SRV are
intentionally leaked (orphaned with the old swap chain) to avoid fighting
unknown ref holders in the precompiled RenderManager.

The flow: Suspend RenderManager, ClearState+Flush, release views,
gdraw_D3D11_PreReset, destroy old swap chain, create new swap chain via
IDXGIFactory, patch offsets, recreate RTV/SRV/DSV, rebind render targets,
update UIController (updateRenderTargets + updateScreenSize),
gdraw_D3D11_PostReset + SetRendertargetSize, IggyFlushInstalledFonts,
Resume, PostProcesser::Init.

WM_SIZE handling defers resize during window drag (WM_ENTERSIZEMOVE/
WM_EXITSIZEMOVE). Immediate resizes (maximize, programmatic) call
ResizeD3D directly. Removed the old UpdateAspectRatio() function.
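
A platform-independent sketch of the deferral logic. Msg and ResizeTracker are hypothetical stand-ins for the Win32 messages and the window procedure state, and applying the last pending size on exit is an assumption about how the deferred resize is completed.

```cpp
#include <cassert>

// Hypothetical stand-ins for WM_ENTERSIZEMOVE, WM_SIZE and WM_EXITSIZEMOVE.
enum class Msg { EnterSizeMove, Size, ExitSizeMove };

struct ResizeTracker {
    bool inSizeMove = false;
    bool pending = false;
    int pendingW = 0, pendingH = 0;
    int appliedW = 0, appliedH = 0;
    int resizeCalls = 0;

    // Stands in for ResizeD3D(w, h).
    void apply(int w, int h)
    {
        appliedW = w;
        appliedH = h;
        ++resizeCalls;
    }

    void handle(Msg m, int w = 0, int h = 0)
    {
        switch (m) {
        case Msg::EnterSizeMove:
            inSizeMove = true;
            break;
        case Msg::Size:
            if (inSizeMove) {
                // Dragging: remember only the latest size.
                pending = true;
                pendingW = w;
                pendingH = h;
            } else {
                // Maximize or programmatic resize: apply immediately.
                apply(w, h);
            }
            break;
        case Msg::ExitSizeMove:
            inSizeMove = false;
            if (pending) {
                pending = false;
                apply(pendingW, pendingH);
            }
            break;
        }
    }
};
```

The point of the deferral is that a drag produces many size messages but only one swap chain recreation.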

CleanupDevice() was leaking g_pDepthStencilView and g_pDepthStencilBuffer;
both are now released.

InitDevice: swap chain BufferUsage now includes DXGI_USAGE_SHADER_INPUT
(needed for the SRV created from the backbuffer for CaptureThumbnail).

New globals: g_rScreenWidth/g_rScreenHeight (real window dimensions,
updated on resize) vs g_iScreenWidth/g_iScreenHeight (fixed logical
resolution, stays 1920x1080).

ComputeViewportForPlayer and getFovAndAspect now use g_rScreenWidth/
g_rScreenHeight instead of the fixed startup values, so 3D perspective
and split-screen viewports adapt to window size.

Main loop: rendering skipped when window is minimized (IsIconic check)
to avoid 100% GPU usage on a hidden swap chain.

Windows64_UIController: new updateRenderTargets(rtv, dsv) method updates
cached D3D pointers used by gdraw_D3D11_SetTileOrigin every frame.
UIController.h: new inline updateScreenSize(w, h) sets m_fScreenWidth/
m_fScreenHeight so all downstream UI code picks up the new size.

== Bitmap font scaling ==

At small window sizes, dynamic text (scrollable list items, HowToPlay
pages) showed overlapping characters. Static SWF text was unaffected
because it uses embedded vector glyphs.

Root cause in UIBitmapFont.cpp GetGlyphBitmap: when display scale is
smaller than the bitmap's native scale (pixel_scale < truePixelScale,
glyphScale stays at 1), Iggy displayed the glyph at native 1:1 pixel
size but advanced the cursor by the smaller display-scale amount.

At intermediate window sizes (e.g. 1678x756, scale factor ~0.7), a
second bug appeared: some SWF font sizes produced pixel_scale just above
truePixelScale (13 for Mojangles_11) while others fell just below,
splitting glyphs across the small-display and normal cache branches.
The normal branch cached all glyphs in a single [truePixelScale, 99]
range, so the first glyph cached set pixel_scale_correct for every
subsequent request regardless of font size. Different font sizes then
got scaled by wrong ratios (e.g. 18.9/13.3 = 1.42x with point sampling),
producing visibly inconsistent letter sizes. This only happened at
specific window sizes where the display scale put some fonts above and
others below the truePixelScale boundary. Full 1080p and very small
windows were unaffected because all fonts landed in the same branch.

Fix: on _WINDOWS64, always use pixel_scale_correct = truePixelScale so
every cache entry is consistent regardless of which font size creates it
first. Two cache ranges: downscale (pixel_scale < truePixelScale) uses
bilinear for smooth reduction, upscale uses point_sample for crisp
pixel-art rendering. At most two cache entries per glyph. The console
code path (fixed resolution, integer-multiple scaling) is preserved
behind #else.

UIScene.cpp loadMovie: always load 1080.swf on _WINDOWS64 regardless of
window size. The old height-based selection could pick 480 or 720 variants
which either crashed or loaded the wrong skin library (skinHD.swf vs
skin.swf). Display size is now set via Fit16x9 BEFORE the init tick so
Iggy's ActionScript text field creation sees the same scale that render()
will use. IggyFlushInstalledFonts() called after init tick to clear stale
glyph cache entries from previous scenes.

Font.cpp addCharacterQuad/renderCharacter: yOff was computed with
m_charWidth instead of m_charHeight, producing wrong texture coordinates
for non-square glyph cells. This is the world-rendering font (chat, signs,
name tags), not the Iggy UI font.
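
To show why the width/height mix-up matters, here is a sketch of a glyph cell lookup. GlyphCellOrigin is a hypothetical helper and the 16-column grid is an assumption; the real atlas layout is not described in this message.

```cpp
#include <cassert>

struct GlyphOrigin {
    int xOff, yOff;
};

// Top-left texel of a glyph cell in a grid atlas. The row offset must use the
// cell height; using the width only happens to work when cells are square.
inline GlyphOrigin GlyphCellOrigin(int glyphIndex, int charWidth, int charHeight,
                                   int columns = 16)
{
    GlyphOrigin o;
    o.xOff = (glyphIndex % columns) * charWidth;
    o.yOff = (glyphIndex / columns) * charHeight;
    return o;
}
```

With 8x12 cells, glyph 17 sits at (8, 12); computing yOff from the width would give (8, 8) and sample the wrong rows.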

== XUID generation ==

Split-screen pad XUIDs derived by hashing baseXuid + iPad through Mix64
(DeriveXuidForPad in Windows64_Xuid.h) instead of simple addition. Pad 0
returns the base XUID unchanged for save compatibility. Includes validity
fallbacks if the hash produces an invalid XUID. (Suggested by rtm516)
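
A sketch of the derivation, with loud assumptions: the body of Mix64 is not shown in this message, so the splitmix64 finalizer is swapped in as a stand-in, and the invalid value (0) and the fallback rule are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// splitmix64 finalizer, used here only as an assumed stand-in for Mix64.
inline std::uint64_t Mix64Sketch(std::uint64_t x)
{
    x ^= x >> 30;
    x *= 0xBF58476D1CE4E5B9ULL;
    x ^= x >> 27;
    x *= 0x94D049BB133111EBULL;
    x ^= x >> 31;
    return x;
}

constexpr std::uint64_t kInvalidXuidSketch = 0; // assumption, not the real INVALID_XUID

inline std::uint64_t DeriveXuidForPadSketch(std::uint64_t baseXuid, int iPad)
{
    if (iPad == 0)
        return baseXuid; // pad 0 keeps the base XUID for save compatibility

    std::uint64_t derived = Mix64Sketch(baseXuid + static_cast<std::uint64_t>(iPad));

    // Hypothetical validity fallback: fall back to simple addition if the
    // hash lands on the invalid value or collides with the base.
    if (derived == kInvalidXuidSketch || derived == baseXuid)
        derived = baseXuid + static_cast<std::uint64_t>(iPad);
    return derived;
}
```

Hashing spreads the pad identities across the 64-bit space, so pads of two different machines are less likely to end up with adjacent or overlapping XUIDs than with base + iPad.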

== Misc ==

Packet::readPacket: thread-local ring buffer tracks last 8 good packet IDs.
On bad packet ID, dumps the history plus next 32 bytes of stream for
diagnosing TCP desynchronization.
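
The packet ID history can be sketched as a small fixed-size ring buffer. PacketIdHistory and t_packetHistory are hypothetical names; only the capacity of 8 and the thread-local storage come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct PacketIdHistory {
    static constexpr std::size_t kCapacity = 8;
    int ids[kCapacity] = {};
    std::size_t count = 0; // total number of IDs recorded so far

    // Overwrite the oldest slot once the buffer is full.
    void record(int id)
    {
        ids[count % kCapacity] = id;
        ++count;
    }

    // Return the retained IDs, oldest first, for dumping on a bad packet ID.
    std::vector<int> snapshot() const
    {
        std::vector<int> out;
        const std::size_t n = count < kCapacity ? count : kCapacity;
        const std::size_t start = count - n;
        for (std::size_t i = 0; i < n; ++i)
            out.push_back(ids[(start + i) % kCapacity]);
        return out;
    }
};

// One history per reading thread, so no locking is needed.
thread_local PacketIdHistory t_packetHistory;
```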

PendingConnection/PlayerList: debug logging for the reconnect flow
(duplicate XUID handling, force-disconnect, handleAcceptedLogin,
placeNewPlayer with smallId/entityId/dimension).

ClientConnection::handleBlockRegionUpdate: warning log when a full chunk
arrives with ys==0 (empty full chunk, data loss indicator).

== Known issues / future work ==

SendOnSocket global lock (WinsockNetLayer.cpp): s_sendLock is a single
CriticalSection serializing ALL TCP sends across ALL connections. If one
client's send() blocks (TCP window full, slow network), every other write
thread stalls: no data flows to any player until the slow send completes.
Each PlayerConnection has its own write thread, so with 8+ players one slow
client can cause latency spikes or timeout disconnects for healthy players.
Fix: replace s_sendLock with per-socket locks indexed by smallId. The lock
only needs to prevent header+payload interleaving on the SAME socket; sends
to different sockets are independent. Deferred to a separate PR to keep
this one focused.
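
The proposed per-socket locking could look roughly like this. PerSocketSender is a hypothetical type, std::mutex stands in for the CriticalSection, the strings stand in for sockets, and the table size of 256 assumes smallId fits in one byte.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>

constexpr std::size_t kMaxSmallIds = 256; // assumption: smallId is a single byte

struct PerSocketSender {
    std::array<std::mutex, kMaxSmallIds> locks;
    std::array<std::string, kMaxSmallIds> wire; // stand-in for the sockets

    // Header and payload are written under the lock of this socket only, so
    // they cannot interleave with another send to the same socket, while
    // sends to other sockets proceed independently.
    void send(unsigned char smallId, const std::string &header,
              const std::string &payload)
    {
        std::lock_guard<std::mutex> guard(locks[smallId]);
        wire[smallId] += header;
        wire[smallId] += payload;
    }
};
```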

Textures::releaseTexture: early return for id <= 0, checks
TextureGetTexture(id) != NULL before calling glDeleteTextures. Prevents
crashes on stale texture IDs after RenderManager reset.

UIController TextureSubstitutionDestroyCallback: null guard on
Minecraft::GetInstance() and mc->textures before calling releaseTexture.
Prevents crash during shutdown.

StringTable: removed __debugbreak() on language load failure in debug builds.
2026-03-08 15:49:50 -05:00

// gdraw_d3d1x_shared.inl - author: Fabian Giesen - copyright 2012 RAD Game Tools
//
// This file implements the part of the Iggy graphics driver layer shared between
// D3D10 and 11 (which is most of it). It heavily depends on a bunch of typedefs,
// #defines and utility functions that need to be set up correctly for the D3D version
// being targeted. This is a bit ugly, but much easier to maintain than the original
// solution, where we just kept two almost identical versions of this code.
// That native handle type holds resource handles and a coarse description.
typedef union {
   // handle that is a texture
   struct {
      ID3D1X(Texture2D) *d3d;
      ID3D1X(ShaderResourceView) *d3d_view;
      ID3D1X(RenderTargetView) *d3d_rtview;
      U32 w, h;
   } tex;
   // handle that is a vertex buffer
   struct {
      ID3D1X(Buffer) *verts;
      ID3D1X(Buffer) *inds;
   } vbuf;
} GDrawNativeHandle;
#define GDRAW_NO_STREAMING_MIPGEN // This renderer doesn't use GDraw-internal mipmap generation
#include "gdraw_shared.inl"
// max rendertarget stack depth. this depends on the extent to which you
// use filters and non-standard blend modes, and how nested they are.
#define MAX_RENDER_STACK_DEPTH 8 // Iggy is hardcoded to a limit of 16... probably 1-3 is realistic
#define AATEX_SAMPLER 7 // sampler that aa_tex gets set in
#define STENCIL_STATE_CACHE_SIZE 32 // number of distinct stencil states we cache DepthStencilStates for
#define QUAD_IB_COUNT 2048 // quad index buffer has indices for this many quads
#define ASSERT_COUNT(a,b) ((a) == (b) ? (b) : -1)
static GDrawFunctions gdraw_funcs;
// render target state
typedef struct
{
GDrawHandle *color_buffer;
S32 base_x, base_y, width, height;
U32 flags;
rrbool cached;
} GDrawFramebufferState;
struct ProgramWithCachedVariableLocations
{
DWORD *bytecode;
union {
DWORD size;
ID3D1X(PixelShader) *pshader;
ID3D1X(VertexShader) *vshader;
};
};
struct DynBuffer
{
ID3D1X(Buffer) *buffer;
U32 size; // size of buffer
U32 write_pos; // start of most recently allocated chunk
U32 alloc_pos; // end of most recently allocated chunk (=start of next allocation)
};
///////////////////////////////////////////////////////////////////////////////
//
// GDraw data structure
//
//
// This is the primary rendering abstraction, which hides all
// the platform-specific rendering behavior from Iggy. It is
// full of platform-specific graphics state, and also general
// graphics state so that it doesn't have to callback into Iggy
// to get at that graphics state.
typedef struct
{
ID3D1XDevice *d3d_device;
ID3D1XContext *d3d_context;
// fragment shaders
ProgramWithCachedVariableLocations fprog[GDRAW_TEXTURE__count][3];
ProgramWithCachedVariableLocations exceptional_blend[GDRAW_BLENDSPECIAL__count];
ProgramWithCachedVariableLocations filter_prog[2][16];
ProgramWithCachedVariableLocations blur_prog[MAX_TAPS+1];
ProgramWithCachedVariableLocations colormatrix;
ProgramWithCachedVariableLocations clear_ps;
// vertex input layouts
ID3D1X(InputLayout) *inlayout[GDRAW_vformat__count];
// vertex shaders
ProgramWithCachedVariableLocations vert[GDRAW_vformat__count]; // [format]
// render targets
GDrawHandleCache rendertargets;
GDrawHandle rendertarget_handles[MAX_RENDER_STACK_DEPTH]; // not -1, because we use +1 to initialize
gswf_recti rt_valid[MAX_RENDER_STACK_DEPTH+1]; // valid rect for texture clamping
// size of framebuffer-sized texture used for implementing blend modes
S32 frametex_width, frametex_height;
// viewport setting (in pixels) for current frame
S32 vx,vy;
S32 fw,fh; // full width/height of virtual display
S32 tw,th; // actual width/height of current tile
S32 tpw,tph; // width/height of padded version of tile
S32 tx0,ty0;
S32 tx0p,ty0p;
rrbool in_blur;
struct {
S32 x,y,w,h;
} cview; // current viewport
F32 projection[4]; // scalex,scaley,transx,transy
F32 projmat[3][4];
F32 xform_3d[3][4];
rrbool use_3d;
ID3D1X(RenderTargetView) *main_framebuffer;
ID3D1X(DepthStencilView) *depth_buffer[2]; // 0=main, 1=rendertarget
ID3D1X(ShaderResourceView) *main_resolve_target;
rrbool main_msaa; // does main framebuffer have MSAA enabled?
ID3D1X(Texture2D) *rt_depth_buffer;
ID3D1X(Texture2D) *aa_tex;
ID3D1X(ShaderResourceView) *aa_tex_view;
ID3D1X(Buffer) *quad_ib; // canned quad indices
// scale factor converting worldspace to viewspace <0,0>..<w,h>
F32 world_to_pixel[2];
// state objects
ID3D1X(RasterizerState) *raster_state[2]; // [msaa]
ID3D1X(SamplerState) *sampler_state[2][GDRAW_WRAP__count]; // [nearest][wrap]
ID3D1X(BlendState) *blend_state[GDRAW_BLEND__count];
ID3D1X(BlendState) *blend_no_color_write;
ID3D1X(DepthStencilState) *depth_state[2][2]; // [set_id][test_id]
// stencil state cache
// SOA so the keys are tightly packed in a few cache lines!
U32 stencil_cache_key[STENCIL_STATE_CACHE_SIZE];
ID3D1X(DepthStencilState) *stencil_cache[STENCIL_STATE_CACHE_SIZE];
U32 stencil_cache_lru[STENCIL_STATE_CACHE_SIZE];
U32 stencil_cache_now;
// constant buffers
ID3D1X(Buffer) *cb_vertex;
ID3D1X(Buffer) *cb_ps_common;
ID3D1X(Buffer) *cb_filter;
ID3D1X(Buffer) *cb_colormatrix;
ID3D1X(Buffer) *cb_blur;
// streaming buffers for dynamic vertex/index data
DynBuffer dyn_vb;
DynBuffer dyn_ib;
U32 dyn_maxalloc, last_dyn_maxalloc;
S32 max_quad_vert_count;
// cached state
U32 scissor_state; // ~0 if unknown, otherwise 0 or 1
S32 blend_mode; // -1 if unknown, otherwise GDRAW_BLEND_*
// render-state stack described above for 'temporary' rendering
GDrawFramebufferState frame[MAX_RENDER_STACK_DEPTH];
GDrawFramebufferState *cur;
// texture and vertex buffer pools
GDrawHandleCache *texturecache;
GDrawHandleCache *vbufcache;
// stat tracking
rrbool frame_done;
U64 frame_counter;
// error handler
void (__cdecl *error_handler)(HRESULT hr);
} GDraw;
static GDraw *gdraw;
static const F32 four_zeros[4] = { 0 }; // used in several places
////////////////////////////////////////////////////////////////////////
//
// General resource management for both textures and vertex buffers
//
template<typename T>
static void safe_release(T *&p)
{
   if (p) {
      p->Release();
      p = NULL;
   }
}
static void report_d3d_error(HRESULT hr, const char *call, const char *context)
{
   if (hr == E_OUTOFMEMORY)
      IggyGDrawSendWarning(NULL, "GDraw D3D out of memory in %s%s", call, context);
   else
      IggyGDrawSendWarning(NULL, "GDraw D3D error in %s%s: 0x%08x", call, context, hr);
}
static void unbind_resources(void)
{
ID3D1XContext *d3d = gdraw->d3d_context;
// unset active textures and vertex/index buffers,
// to make sure there are no dangling refs
static ID3D1X(ShaderResourceView) *no_views[3] = { 0 };
ID3D1X(Buffer) *no_vb = NULL;
UINT no_offs = 0;
d3d->PSSetShaderResources(0, 3, no_views);
d3d->IASetVertexBuffers(0, 1, &no_vb, &no_offs, &no_offs);
d3d->IASetIndexBuffer(NULL, DXGI_FORMAT_UNKNOWN, 0);
}
static void api_free_resource(GDrawHandle *r)
{
   unbind_resources();
   if (r->state != GDRAW_HANDLE_STATE_user_owned) {
      if (!r->cache->is_vertex) {
         safe_release(r->handle.tex.d3d_view);
         safe_release(r->handle.tex.d3d_rtview);
         safe_release(r->handle.tex.d3d);
      } else {
         safe_release(r->handle.vbuf.verts);
         safe_release(r->handle.vbuf.inds);
      }
   }
}
static void RADLINK gdraw_UnlockHandles(GDrawStats * /*stats*/)
{
gdraw_HandleCacheUnlockAll(gdraw->texturecache);
gdraw_HandleCacheUnlockAll(gdraw->vbufcache);
}
////////////////////////////////////////////////////////////////////////
//
// Dynamic buffer
//
static void *start_write_dyn(DynBuffer *buf, U32 size)
{
   U8 *ptr = NULL;
   if (size > buf->size) {
      IggyGDrawSendWarning(NULL, "GDraw dynamic vertex buffer usage of %d bytes in one call larger than buffer size %d", size, buf->size);
      return NULL;
   }
   // update statistics
   gdraw->dyn_maxalloc = RR_MAX(gdraw->dyn_maxalloc, size);
   // invariant: current alloc_pos is in [0,size]
   assert(buf->alloc_pos <= buf->size);
   // wrap around when less than "size" bytes left in buffer
   buf->write_pos = ((buf->size - buf->alloc_pos) < size) ? 0 : buf->alloc_pos;
   // discard buffer whenever the current write position is 0;
   // done this way so that if a DISCARD Map() were to fail, we would
   // just keep retrying the next time around.
   ptr = (U8 *) map_buffer(gdraw->d3d_context, buf->buffer, buf->write_pos == 0);
   if (ptr) {
      ptr += buf->write_pos; // we return pointer to write position in buffer
      buf->alloc_pos = buf->write_pos + size; // bump alloc position
      assert(buf->alloc_pos <= buf->size); // invariant again
   }
   // if map_buffer fails, it will have sent a warning
   return ptr;
}
static U32 end_write_dyn(DynBuffer *buf)
{
unmap_buffer(gdraw->d3d_context, buf->buffer);
return buf->write_pos;
}
////////////////////////////////////////////////////////////////////////
//
// Stencil state cache
//
static void stencil_state_cache_clear()
{
S32 i;
for (i=0; i < STENCIL_STATE_CACHE_SIZE; ++i) {
gdraw->stencil_cache_key[i] = 0;
safe_release(gdraw->stencil_cache[i]);
gdraw->stencil_cache_lru[i] = 0;
}
gdraw->stencil_cache_now = 0;
}
static ID3D1X(DepthStencilState) *stencil_state_cache_lookup(rrbool set_id, rrbool test_id, U8 read_mask, U8 write_mask)
{
D3D1X_(DEPTH_STENCIL_DESC) desc;
S32 i, best = 0;
U32 key = (set_id << 1) | test_id | (read_mask << 8) | (write_mask << 16);
U32 now, age, highest_age;
HRESULT hr;
// for LRU
now = ++gdraw->stencil_cache_now;
// do we have this in the cache?
for (i=0; i < STENCIL_STATE_CACHE_SIZE; ++i) {
if (gdraw->stencil_cache_key[i] == key) {
gdraw->stencil_cache_lru[i] = now;
return gdraw->stencil_cache[i];
}
}
// not in the cache, find the best slot to replace it with (LRU)
highest_age = 0;
for (i=0; i < STENCIL_STATE_CACHE_SIZE; ++i) {
if (!gdraw->stencil_cache[i]) { // unused slot!
best = i;
break;
}
age = now - gdraw->stencil_cache_lru[i];
if (age > highest_age) {
highest_age = age;
best = i;
}
}
// release old depth/stencil state at that position and create new one
safe_release(gdraw->stencil_cache[best]);
gdraw->depth_state[set_id][test_id]->GetDesc(&desc); // reference state
desc.StencilEnable = TRUE;
desc.StencilReadMask = read_mask;
desc.StencilWriteMask = write_mask;
desc.FrontFace.StencilFailOp = D3D1X_(STENCIL_OP_KEEP);
desc.FrontFace.StencilDepthFailOp = D3D1X_(STENCIL_OP_KEEP);
desc.FrontFace.StencilPassOp = D3D1X_(STENCIL_OP_REPLACE);
desc.FrontFace.StencilFunc = D3D1X_(COMPARISON_EQUAL);
desc.BackFace.StencilFailOp = D3D1X_(STENCIL_OP_KEEP);
desc.BackFace.StencilDepthFailOp = D3D1X_(STENCIL_OP_KEEP);
desc.BackFace.StencilPassOp = D3D1X_(STENCIL_OP_REPLACE);
desc.BackFace.StencilFunc = D3D1X_(COMPARISON_EQUAL);
hr = gdraw->d3d_device->CreateDepthStencilState(&desc, &gdraw->stencil_cache[best]);
if (FAILED(hr))
report_d3d_error(hr, "CreateDepthStencilState", "");
gdraw->stencil_cache_key[best] = key;
gdraw->stencil_cache_lru[best] = now;
return gdraw->stencil_cache[best];
}
////////////////////////////////////////////////////////////////////////
//
// Texture creation/updating/deletion
//
extern GDrawTexture *gdraw_D3D1X_(WrappedTextureCreate)(ID3D1X(ShaderResourceView) *tex_view)
{
GDrawStats stats={0};
GDrawHandle *p = gdraw_res_alloc_begin(gdraw->texturecache, 0, &stats); // it may need to free one item to give us a handle
p->handle.tex.d3d = NULL;
p->handle.tex.d3d_view = tex_view;
p->handle.tex.d3d_rtview = NULL;
p->handle.tex.w = 1;
p->handle.tex.h = 1;
gdraw_HandleCacheAllocateEnd(p, 0, NULL, GDRAW_HANDLE_STATE_user_owned);
return (GDrawTexture *) p;
}
extern void gdraw_D3D1X_(WrappedTextureChange)(GDrawTexture *tex, ID3D1X(ShaderResourceView) *tex_view)
{
GDrawHandle *p = (GDrawHandle *) tex;
p->handle.tex.d3d = NULL;
p->handle.tex.d3d_view = tex_view;
}
extern void gdraw_D3D1X_(WrappedTextureDestroy)(GDrawTexture *tex)
{
GDrawStats stats={0};
gdraw_res_free((GDrawHandle *) tex, &stats);
}
static void RADLINK gdraw_SetTextureUniqueID(GDrawTexture *tex, void *old_id, void *new_id)
{
GDrawHandle *p = (GDrawHandle *) tex;
// if this is still the handle it's thought to be, change the owner;
// if the owner *doesn't* match, then they're changing a stale handle, so ignore
if (p->owner == old_id)
p->owner = new_id;
}
static rrbool RADLINK gdraw_MakeTextureBegin(void *owner, S32 width, S32 height, gdraw_texture_format format, U32 flags, GDraw_MakeTexture_ProcessingInfo *p, GDrawStats *stats)
{
GDrawHandle *t = NULL;
DXGI_FORMAT dxgi_fmt;
S32 bpp, size = 0, nmips = 0;
if (width >= 16384 || height >= 16384) {
IggyGDrawSendWarning(NULL, "GDraw texture size too large (%d x %d), dimension limit is 16384", width, height);
return false;
}
if (format == GDRAW_TEXTURE_FORMAT_rgba32) {
dxgi_fmt = DXGI_FORMAT_R8G8B8A8_UNORM;
bpp = 4;
} else {
dxgi_fmt = DXGI_FORMAT_R8_UNORM;
bpp = 1;
}
// compute estimated size of texture in video memory
do {
size += RR_MAX(width >> nmips, 1) * RR_MAX(height >> nmips, 1) * bpp;
++nmips;
} while ((flags & GDRAW_MAKETEXTURE_FLAGS_mipmap) && ((width >> nmips) || (height >> nmips)));
// try to allocate memory for the client to write to
p->texture_data = (U8 *) IggyGDrawMalloc(size);
if (!p->texture_data) {
IggyGDrawSendWarning(NULL, "GDraw out of memory to store texture data to pass to D3D for %d x %d texture", width, height);
return false;
}
// allocate a handle and make room in the cache for this much data
t = gdraw_res_alloc_begin(gdraw->texturecache, size, stats);
if (!t) {
IggyGDrawFree(p->texture_data);
return false;
}
t->handle.tex.w = width;
t->handle.tex.h = height;
t->handle.tex.d3d = NULL;
t->handle.tex.d3d_view = NULL;
t->handle.tex.d3d_rtview = NULL;
p->texture_type = GDRAW_TEXTURE_TYPE_rgba;
p->p0 = t;
p->p1 = owner;
p->i0 = width;
p->i1 = height;
p->i2 = flags;
p->i3 = dxgi_fmt;
p->i4 = size;
p->i5 = nmips;
p->i6 = bpp;
p->stride_in_bytes = width * bpp;
p->num_rows = height;
return true;
}
static rrbool RADLINK gdraw_MakeTextureMore(GDraw_MakeTexture_ProcessingInfo * /*p*/)
{
return false;
}
static GDrawTexture * RADLINK gdraw_MakeTextureEnd(GDraw_MakeTexture_ProcessingInfo *p, GDrawStats *stats)
{
GDrawHandle *t = (GDrawHandle *) p->p0;
D3D1X_(SUBRESOURCE_DATA) mipdata[24];
S32 i, w, h, nmips, bpp;
HRESULT hr = S_OK;
const char *failed_call;
U8 *ptr;
// generate mip maps and set up descriptors for them
assert(p->i5 <= 24);
ptr = p->texture_data;
w = p->i0;
h = p->i1;
nmips = p->i5;
bpp = p->i6;
for (i=0; i < nmips; ++i) {
mipdata[i].pSysMem = ptr;
mipdata[i].SysMemPitch = RR_MAX(w >> i, 1) * bpp;
mipdata[i].SysMemSlicePitch = 0;
ptr += mipdata[i].SysMemPitch * RR_MAX(h >> i, 1);
// create mip data by downsampling
if (i)
gdraw_Downsample((U8 *) mipdata[i].pSysMem, mipdata[i].SysMemPitch, w >> i, h >> i,
(U8 *) mipdata[i-1].pSysMem, mipdata[i-1].SysMemPitch, bpp);
}
// actually create texture
D3D1X_(TEXTURE2D_DESC) desc = { static_cast<U32>(w), static_cast<U32>(h), static_cast<U32>(nmips), 1, static_cast<DXGI_FORMAT>(p->i3), { 1, 0 },
(p->i2 & GDRAW_MAKETEXTURE_FLAGS_updatable) ? D3D1X_(USAGE_DEFAULT) : D3D1X_(USAGE_IMMUTABLE),
D3D1X_(BIND_SHADER_RESOURCE), 0, 0 };
failed_call = "CreateTexture2D";
hr = gdraw->d3d_device->CreateTexture2D(&desc, mipdata, &t->handle.tex.d3d);
if (FAILED(hr)) goto done;
// and create a corresponding shader resource view
failed_call = "CreateShaderResourceView";
hr = gdraw->d3d_device->CreateShaderResourceView(t->handle.tex.d3d, NULL, &t->handle.tex.d3d_view);
done:
if (!FAILED(hr)) {
gdraw_HandleCacheAllocateEnd(t, p->i4, p->p1, (p->i2 & GDRAW_MAKETEXTURE_FLAGS_never_flush) ? GDRAW_HANDLE_STATE_pinned : GDRAW_HANDLE_STATE_locked);
stats->nonzero_flags |= GDRAW_STATS_alloc_tex;
stats->alloc_tex += 1;
stats->alloc_tex_bytes += p->i4;
} else {
safe_release(t->handle.tex.d3d);
safe_release(t->handle.tex.d3d_view);
gdraw_HandleCacheAllocateFail(t);
t = NULL;
report_d3d_error(hr, failed_call, " while creating texture");
}
IggyGDrawFree(p->texture_data);
return (GDrawTexture *) t;
}
static rrbool RADLINK gdraw_UpdateTextureBegin(GDrawTexture *t, void *unique_id, GDrawStats * /*stats*/)
{
return gdraw_HandleCacheLock((GDrawHandle *) t, unique_id);
}
static void RADLINK gdraw_UpdateTextureRect(GDrawTexture *t, void * /*unique_id*/, S32 x, S32 y, S32 stride, S32 w, S32 h, U8 *samples, gdraw_texture_format /*format*/)
{
GDrawHandle *s = (GDrawHandle *) t;
D3D1X_(BOX) box = { static_cast<U32>(x), static_cast<U32>(y), 0U, static_cast<U32>(x + w), static_cast<U32>(y + h), 1U };
gdraw->d3d_context->UpdateSubresource(s->handle.tex.d3d, 0, &box, samples, stride, 0);
}
static void RADLINK gdraw_UpdateTextureEnd(GDrawTexture *t, void * /*unique_id*/, GDrawStats * /*stats*/)
{
gdraw_HandleCacheUnlock((GDrawHandle *) t);
}
static void RADLINK gdraw_FreeTexture(GDrawTexture *tt, void *unique_id, GDrawStats *stats)
{
GDrawHandle *t = (GDrawHandle *) tt;
assert(t != NULL); // @GDRAW_ASSERT
if (t->owner == unique_id || unique_id == NULL) {
if (t->cache == &gdraw->rendertargets) {
gdraw_HandleCacheUnlock(t);
// cache it by simply not freeing it
return;
}
gdraw_res_free(t, stats);
}
}
static rrbool RADLINK gdraw_TryToLockTexture(GDrawTexture *t, void *unique_id, GDrawStats * /*stats*/)
{
return gdraw_HandleCacheLock((GDrawHandle *) t, unique_id);
}
static void RADLINK gdraw_DescribeTexture(GDrawTexture *tex, GDraw_Texture_Description *desc)
{
GDrawHandle *p = (GDrawHandle *) tex;
desc->width = p->handle.tex.w;
desc->height = p->handle.tex.h;
desc->size_in_bytes = p->bytes;
}
static void RADLINK gdraw_SetAntialiasTexture(S32 width, U8 *rgba)
{
HRESULT hr;
safe_release(gdraw->aa_tex_view);
safe_release(gdraw->aa_tex);
D3D1X_(TEXTURE2D_DESC) desc = { static_cast<U32>(width), 1U, 1U, 1U, DXGI_FORMAT_R8G8B8A8_UNORM, { 1, 0 }, D3D1X_(USAGE_IMMUTABLE), D3D1X_(BIND_SHADER_RESOURCE), 0U, 0U };
D3D1X_(SUBRESOURCE_DATA) data = { rgba, static_cast<U32>(width) * 4U, 0U };
hr = gdraw->d3d_device->CreateTexture2D(&desc, &data, &gdraw->aa_tex);
if (FAILED(hr)) {
report_d3d_error(hr, "CreateTexture2D", " while creating antialias texture");
return;
}
hr = gdraw->d3d_device->CreateShaderResourceView(gdraw->aa_tex, NULL, &gdraw->aa_tex_view);
if (FAILED(hr)) {
report_d3d_error(hr, "CreateShaderResourceView", " while creating antialias texture");
safe_release(gdraw->aa_tex);
return;
}
}
////////////////////////////////////////////////////////////////////////
//
// Vertex buffer creation/deletion
//
static rrbool RADLINK gdraw_MakeVertexBufferBegin(void *unique_id, gdraw_vformat /*vformat*/, S32 vbuf_size, S32 ibuf_size, GDraw_MakeVertexBuffer_ProcessingInfo *p, GDrawStats *stats)
{
// prepare staging buffers for the app to put data into
p->vertex_data = (U8 *) IggyGDrawMalloc(vbuf_size);
p->index_data = (U8 *) IggyGDrawMalloc(ibuf_size);
if (p->vertex_data && p->index_data) {
GDrawHandle *vb = gdraw_res_alloc_begin(gdraw->vbufcache, vbuf_size + ibuf_size, stats);
if (vb) {
vb->handle.vbuf.verts = NULL;
vb->handle.vbuf.inds = NULL;
p->vertex_data_length = vbuf_size;
p->index_data_length = ibuf_size;
p->p0 = vb;
p->p1 = unique_id;
return true;
}
}
if (p->vertex_data)
IggyGDrawFree(p->vertex_data);
if (p->index_data)
IggyGDrawFree(p->index_data);
return false;
}
static rrbool RADLINK gdraw_MakeVertexBufferMore(GDraw_MakeVertexBuffer_ProcessingInfo * /*p*/)
{
assert(0); // Begin hands out full-size staging buffers, so "More" should never be requested
return false;
}
static GDrawVertexBuffer * RADLINK gdraw_MakeVertexBufferEnd(GDraw_MakeVertexBuffer_ProcessingInfo *p, GDrawStats * /*stats*/)
{
GDrawHandle *vb = (GDrawHandle *) p->p0;
HRESULT hr;
D3D1X_(BUFFER_DESC) vbdesc = { static_cast<U32>(p->vertex_data_length), D3D1X_(USAGE_IMMUTABLE), D3D1X_(BIND_VERTEX_BUFFER), 0U, 0U };
D3D1X_(SUBRESOURCE_DATA) vbdata = { p->vertex_data, 0, 0 };
D3D1X_(BUFFER_DESC) ibdesc = { static_cast<U32>(p->index_data_length), D3D1X_(USAGE_IMMUTABLE), D3D1X_(BIND_INDEX_BUFFER), 0U, 0U };
D3D1X_(SUBRESOURCE_DATA) ibdata = { p->index_data, 0, 0 };
hr = gdraw->d3d_device->CreateBuffer(&vbdesc, &vbdata, &vb->handle.vbuf.verts);
if (!FAILED(hr))
hr = gdraw->d3d_device->CreateBuffer(&ibdesc, &ibdata, &vb->handle.vbuf.inds);
if (FAILED(hr)) {
safe_release(vb->handle.vbuf.verts);
safe_release(vb->handle.vbuf.inds);
gdraw_HandleCacheAllocateFail(vb);
vb = NULL;
report_d3d_error(hr, "CreateBuffer", " creating vertex buffer");
} else {
gdraw_HandleCacheAllocateEnd(vb, p->vertex_data_length + p->index_data_length, p->p1, GDRAW_HANDLE_STATE_locked);
}
IggyGDrawFree(p->vertex_data);
IggyGDrawFree(p->index_data);
return (GDrawVertexBuffer *) vb;
}
static rrbool RADLINK gdraw_TryLockVertexBuffer(GDrawVertexBuffer *vb, void *unique_id, GDrawStats * /*stats*/)
{
return gdraw_HandleCacheLock((GDrawHandle *) vb, unique_id);
}
static void RADLINK gdraw_FreeVertexBuffer(GDrawVertexBuffer *vb, void *unique_id, GDrawStats *stats)
{
GDrawHandle *h = (GDrawHandle *) vb;
assert(h != NULL); // @GDRAW_ASSERT
if (h->owner == unique_id)
gdraw_res_free(h, stats);
}
static void RADLINK gdraw_DescribeVertexBuffer(GDrawVertexBuffer *vbuf, GDraw_VertexBuffer_Description *desc)
{
GDrawHandle *p = (GDrawHandle *) vbuf;
desc->size_in_bytes = p->bytes;
}
////////////////////////////////////////////////////////////////////////
//
// Create/free (or cache) render targets
//
static GDrawHandle *get_color_rendertarget(GDrawStats *stats)
{
const char *failed_call;
// try to recycle LRU rendertarget
GDrawHandle *t = gdraw_HandleCacheGetLRU(&gdraw->rendertargets);
if (t) {
gdraw_HandleCacheLock(t, (void *) 1);
return t;
}
// ran out of RTs, allocate a new one
S32 size = gdraw->frametex_width * gdraw->frametex_height * 4;
if (gdraw->rendertargets.bytes_free < size) {
IggyGDrawSendWarning(NULL, "GDraw rendertarget allocation failed: hit size limit of %d bytes", gdraw->rendertargets.total_bytes);
return NULL;
}
t = gdraw_HandleCacheAllocateBegin(&gdraw->rendertargets);
if (!t) {
IggyGDrawSendWarning(NULL, "GDraw rendertarget allocation failed: hit handle limit");
return t;
}
D3D1X_(TEXTURE2D_DESC) desc = { static_cast<U32>(gdraw->frametex_width), static_cast<U32>(gdraw->frametex_height), 1U, 1U, DXGI_FORMAT_R8G8B8A8_UNORM, { 1, 0 },
D3D1X_(USAGE_DEFAULT), D3D1X_(BIND_SHADER_RESOURCE) | D3D1X_(BIND_RENDER_TARGET), 0U, 0U };
t->handle.tex.d3d = NULL;
t->handle.tex.d3d_view = NULL;
t->handle.tex.d3d_rtview = NULL;
HRESULT hr = gdraw->d3d_device->CreateTexture2D(&desc, NULL, &t->handle.tex.d3d);
failed_call = "CreateTexture2D";
if (!FAILED(hr)) {
hr = gdraw->d3d_device->CreateShaderResourceView(t->handle.tex.d3d, NULL, &t->handle.tex.d3d_view);
failed_call = "CreateShaderResourceView";
}
if (!FAILED(hr)) {
hr = gdraw->d3d_device->CreateRenderTargetView(t->handle.tex.d3d, NULL, &t->handle.tex.d3d_rtview);
failed_call = "CreateRenderTargetView";
}
if (FAILED(hr)) {
safe_release(t->handle.tex.d3d);
safe_release(t->handle.tex.d3d_view);
safe_release(t->handle.tex.d3d_rtview);
gdraw_HandleCacheAllocateFail(t);
report_d3d_error(hr, failed_call, " creating rendertarget");
return NULL;
}
gdraw_HandleCacheAllocateEnd(t, size, (void *) 1, GDRAW_HANDLE_STATE_locked);
stats->nonzero_flags |= GDRAW_STATS_alloc_tex;
stats->alloc_tex += 1;
stats->alloc_tex_bytes += size;
return t;
}
static ID3D1X(DepthStencilView) *get_rendertarget_depthbuffer(GDrawStats *stats)
{
if (!gdraw->depth_buffer[1]) {
const char *failed_call;
assert(!gdraw->rt_depth_buffer);
D3D1X_(TEXTURE2D_DESC) desc = { static_cast<U32>(gdraw->frametex_width), static_cast<U32>(gdraw->frametex_height), 1U, 1U, DXGI_FORMAT_D24_UNORM_S8_UINT, { 1, 0 },
D3D1X_(USAGE_DEFAULT), D3D1X_(BIND_DEPTH_STENCIL), 0U, 0U };
HRESULT hr = gdraw->d3d_device->CreateTexture2D(&desc, NULL, &gdraw->rt_depth_buffer);
failed_call = "CreateTexture2D";
if (!FAILED(hr)) {
hr = gdraw->d3d_device->CreateDepthStencilView(gdraw->rt_depth_buffer, NULL, &gdraw->depth_buffer[1]);
failed_call = "CreateDepthStencilView";
}
if (FAILED(hr)) {
// keep the call name and the context separate so both failure paths report the same context
report_d3d_error(hr, failed_call, " while creating rendertarget depth buffer");
safe_release(gdraw->rt_depth_buffer);
safe_release(gdraw->depth_buffer[1]);
} else {
stats->nonzero_flags |= GDRAW_STATS_alloc_tex;
stats->alloc_tex += 1;
stats->alloc_tex_bytes += gdraw->frametex_width * gdraw->frametex_height * 4;
gdraw->d3d_context->ClearDepthStencilView(gdraw->depth_buffer[1], D3D1X_(CLEAR_DEPTH) | D3D1X_(CLEAR_STENCIL), 1.0f, 0);
}
}
return gdraw->depth_buffer[1];
}
static void flush_rendertargets(GDrawStats *stats)
{
gdraw_res_flush(&gdraw->rendertargets, stats);
safe_release(gdraw->depth_buffer[1]);
safe_release(gdraw->rt_depth_buffer);
}
////////////////////////////////////////////////////////////////////////
//
// Constant buffer layouts
//
struct VertexVars
{
F32 world[2][4];
F32 x_off[4];
F32 texgen_s[4];
F32 texgen_t[4];
F32 x3d[4];
F32 y3d[4];
F32 w3d[4];
};
struct PixelCommonVars
{
F32 color_mul[4];
F32 color_add[4];
F32 focal[4];
F32 rescale1[4];
};
struct PixelParaFilter
{
F32 clamp0[4], clamp1[4];
F32 color[4], color2[4];
F32 tc_off[4];
};
struct PixelParaBlur
{
F32 clamp[4];
F32 tap[9][4];
};
struct PixelParaColorMatrix
{
F32 data[5][4];
};
////////////////////////////////////////////////////////////////////////
//
// Rendering helpers
//
static void disable_scissor(int force)
{
if (force || gdraw->scissor_state) {
// disable scissor by setting whole viewport as scissor rect
S32 x = gdraw->cview.x;
S32 y = gdraw->cview.y;
D3D1X_(RECT) r = { x, y, x + gdraw->cview.w, y + gdraw->cview.h };
gdraw->d3d_context->RSSetScissorRects(1, &r);
gdraw->scissor_state = 0;
}
}
static void set_viewport_raw(S32 x, S32 y, S32 w, S32 h)
{
D3D1X_(VIEWPORT) vp = { (ViewCoord) x, (ViewCoord) y, (ViewCoord) w, (ViewCoord) h, 0.0f, 1.0f };
gdraw->d3d_context->RSSetViewports(1, &vp);
gdraw->cview.x = x;
gdraw->cview.y = y;
gdraw->cview.w = w;
gdraw->cview.h = h;
disable_scissor(1);
}
static void set_projection_base(void)
{
// x3d = < viewproj.x, 0, 0, 0 >
// y3d = < 0, viewproj.y, 0, 0 >
// w3d = < viewproj.z, viewproj.w, 1.0, 1.0 >
memset(gdraw->projmat[0], 0, sizeof(gdraw->projmat));
gdraw->projmat[0][0] = gdraw->projection[0];
gdraw->projmat[1][1] = gdraw->projection[1];
gdraw->projmat[2][0] = gdraw->projection[2];
gdraw->projmat[2][1] = gdraw->projection[3];
gdraw->projmat[2][2] = 1.0;
gdraw->projmat[2][3] = 1.0;
}
static void set_projection_raw(S32 x0, S32 x1, S32 y0, S32 y1)
{
gdraw->projection[0] = 2.0f / (x1-x0);
gdraw->projection[1] = 2.0f / (y1-y0);
gdraw->projection[2] = (x1+x0)/(F32)(x0-x1);
gdraw->projection[3] = (y1+y0)/(F32)(y0-y1);
set_projection_base();
}
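// Worked example (illustrative values, not taken from the game): for a
// full-frame 1280x720 tile, set_projection() ends up calling
//   set_projection_raw(0, 1280, 720, 0)
// which yields
//   projection[0] = 2 / (1280 - 0)           =  0.0015625    (x scale)
//   projection[1] = 2 / (0 - 720)            = -0.0027777... (y scale, flips y)
//   projection[2] = (1280 + 0) / (0 - 1280)  = -1            (x offset)
//   projection[3] = (0 + 720) / (720 - 0)    =  1            (y offset)
// Assuming the vertex shader (not shown in this part of the file) applies
// these as ndc = pixel * scale + offset, x=0 maps to -1, x=1280 to +1,
// y=0 to +1 (top) and y=720 to -1 (bottom).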
static void set_viewport(void)
{
if (gdraw->in_blur) {
set_viewport_raw(0, 0, gdraw->tpw, gdraw->tph);
return;
}
if (gdraw->cur == gdraw->frame) // if the rendering stack is empty
// render a tile-sized region to the user-requested tile location
set_viewport_raw(gdraw->vx, gdraw->vy, gdraw->tw, gdraw->th);
else if (gdraw->cur->cached)
set_viewport_raw(0, 0, gdraw->cur->width, gdraw->cur->height);
else
// if on the render stack, draw a padded-tile-sized region at the origin
set_viewport_raw(0, 0, gdraw->tpw, gdraw->tph);
}
static void set_projection(void)
{
if (gdraw->in_blur) return;
if (gdraw->cur == gdraw->frame) // if the render stack is empty
set_projection_raw(gdraw->tx0, gdraw->tx0+gdraw->tw, gdraw->ty0+gdraw->th, gdraw->ty0);
else if (gdraw->cur->cached)
set_projection_raw(gdraw->cur->base_x, gdraw->cur->base_x+gdraw->cur->width, gdraw->cur->base_y, gdraw->cur->base_y+gdraw->cur->height);
else
set_projection_raw(gdraw->tx0p, gdraw->tx0p+gdraw->tpw, gdraw->ty0p+gdraw->tph, gdraw->ty0p);
}
static void clear_renderstate(void)
{
gdraw->d3d_context->ClearState();
}
static void set_common_renderstate()
{
ID3D1XContext *d3d = gdraw->d3d_context;
S32 i;
clear_renderstate();
// all the render states we never change while drawing
d3d->IASetPrimitiveTopology(D3D1X_(PRIMITIVE_TOPOLOGY_TRIANGLELIST));
d3d->PSSetShaderResources(7, 1, &gdraw->aa_tex_view);
d3d->PSSetSamplers(7, 1, &gdraw->sampler_state[0][GDRAW_WRAP_clamp]);
// set a well-defined default sampler for all PS textures we use
for (i=0; i < 3; ++i)
d3d->PSSetSamplers(i, 1, &gdraw->sampler_state[0][GDRAW_WRAP_clamp]);
// reset our state caching
gdraw->scissor_state = ~0u;
gdraw->blend_mode = -1;
}
static void manual_clear(gswf_recti *r, GDrawStats *stats);
static void set_render_target(GDrawStats *stats);
////////////////////////////////////////////////////////////////////////
//
// Begin/end rendering of a tile and per-frame processing
//
void gdraw_D3D1X_(SetRendertargetSize)(S32 w, S32 h)
{
if (gdraw && (w != gdraw->frametex_width || h != gdraw->frametex_height)) {
GDrawStats stats = { 0 };
gdraw->frametex_width = w;
gdraw->frametex_height = h;
flush_rendertargets(&stats);
}
}
void gdraw_D3D1X_(SetTileOrigin)(ID3D1X(RenderTargetView) *main_rt, ID3D1X(DepthStencilView) *main_ds, ID3D1X(ShaderResourceView) *non_msaa_rt, S32 x, S32 y)
{
if (!gdraw) return; // AAR - safety check because Windows calls resize early
D3D1X_(RENDER_TARGET_VIEW_DESC) desc;
if (gdraw->frame_done) {
++gdraw->frame_counter;
gdraw->frame_done = false;
}
main_rt->GetDesc(&desc);
gdraw->main_framebuffer = main_rt;
gdraw->main_resolve_target = non_msaa_rt;
gdraw->main_msaa = (desc.ViewDimension == D3D1X_(RTV_DIMENSION_TEXTURE2DMS));
gdraw->depth_buffer[0] = main_ds;
gdraw->vx = x;
gdraw->vy = y;
}
static void RADLINK gdraw_SetViewSizeAndWorldScale(S32 w, S32 h, F32 scalex, F32 scaley)
{
static S32 s_lastW = 0, s_lastH = 0;
static F32 s_lastSx = 0, s_lastSy = 0;
if (w != s_lastW || h != s_lastH || scalex != s_lastSx || scaley != s_lastSy) {
app.DebugPrintf("[GDRAW] SetViewSize: fw=%d fh=%d scale=%.6f,%.6f frametex=%dx%d vx=%d vy=%d\n",
w, h, scalex, scaley, gdraw->frametex_width, gdraw->frametex_height, gdraw->vx, gdraw->vy);
s_lastW = w; s_lastH = h; s_lastSx = scalex; s_lastSy = scaley;
}
memset(gdraw->frame, 0, sizeof(gdraw->frame));
gdraw->cur = gdraw->frame;
gdraw->fw = w;
gdraw->fh = h;
gdraw->tw = w;
gdraw->th = h;
gdraw->world_to_pixel[0] = scalex;
gdraw->world_to_pixel[1] = scaley;
set_viewport();
}
// must include anything necessary for texture creation/update
static void RADLINK gdraw_RenderingBegin(void)
{
}
static void RADLINK gdraw_RenderingEnd(void)
{
}
static void RADLINK gdraw_RenderTileBegin(S32 x0, S32 y0, S32 x1, S32 y1, S32 pad, GDrawStats *stats)
{
if (x0 == 0 && y0 == 0 && x1 == gdraw->fw && y1 == gdraw->fh)
pad = 0;
gdraw->tx0 = x0;
gdraw->ty0 = y0;
gdraw->tw = x1-x0;
gdraw->th = y1-y0;
// padded region
gdraw->tx0p = RR_MAX(x0 - pad, 0);
gdraw->ty0p = RR_MAX(y0 - pad, 0);
gdraw->tpw = RR_MIN(x1 + pad, gdraw->fw) - gdraw->tx0p;
gdraw->tph = RR_MIN(y1 + pad, gdraw->fh) - gdraw->ty0p;
// make sure our rendertargets are large enough to contain the tile
if (gdraw->tpw > gdraw->frametex_width || gdraw->tph > gdraw->frametex_height) {
gdraw->frametex_width = RR_MAX(gdraw->tpw, gdraw->frametex_width);
gdraw->frametex_height = RR_MAX(gdraw->tph, gdraw->frametex_height);
flush_rendertargets(stats);
}
assert(gdraw->tpw <= gdraw->frametex_width && gdraw->tph <= gdraw->frametex_height);
// set up rendertargets we'll use
set_common_renderstate();
gdraw->d3d_context->ClearDepthStencilView(gdraw->depth_buffer[0], D3D1X_(CLEAR_DEPTH) | D3D1X_(CLEAR_STENCIL), 1.0f, 0);
if (gdraw->depth_buffer[1])
gdraw->d3d_context->ClearDepthStencilView(gdraw->depth_buffer[1], D3D1X_(CLEAR_DEPTH) | D3D1X_(CLEAR_STENCIL), 1.0f, 0);
set_projection();
set_viewport();
set_render_target(stats);
}
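// Worked example of the padded-tile math above (illustrative values): with
// fw=1280, fh=720 and a call gdraw_RenderTileBegin(256, 0, 512, 256, 8, stats):
//   tx0  = 256, ty0 = 0, tw = 256, th = 256
//   tx0p = max(256 - 8, 0)             = 248
//   ty0p = max(0 - 8, 0)               = 0     (clamped at the frame edge)
//   tpw  = min(512 + 8, 1280) - 248    = 272
//   tph  = min(256 + 8, 720) - 0       = 264
// A tile covering the whole frame (0, 0, fw, fh) forces pad to 0, so in that
// case tpw == tw and tph == th.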
static void RADLINK gdraw_RenderTileEnd(GDrawStats * /*stats*/)
{
}
void gdraw_D3D1X_(NoMoreGDrawThisFrame)(void)
{
clear_renderstate();
gdraw->frame_done = true;
gdraw->last_dyn_maxalloc = gdraw->dyn_maxalloc;
gdraw->dyn_maxalloc = 0;
// reset dynamic buffer alloc position so they get DISCARDed
// next time around.
gdraw->dyn_vb.alloc_pos = 0;
gdraw->dyn_ib.alloc_pos = 0;
GDrawFence now = { gdraw->frame_counter };
gdraw_HandleCacheTick(gdraw->texturecache, now);
gdraw_HandleCacheTick(gdraw->vbufcache, now);
}
#define MAX_DEPTH_VALUE (1 << 13)
static void RADLINK gdraw_GetInfo(GDrawInfo *d)
{
d->num_stencil_bits = 8;
d->max_id = MAX_DEPTH_VALUE-2;
// for floating point depth, just use mantissa, e.g. 16-20 bits
d->buffer_format = GDRAW_BFORMAT_vbib;
d->shared_depth_stencil = 1;
d->always_mipmap = 1;
#ifndef GDRAW_D3D11_LEVEL9
d->max_texture_size = 8192;
d->conditional_nonpow2 = 0;
#else
d->max_texture_size = 2048;
d->conditional_nonpow2 = 1;
#endif
}
////////////////////////////////////////////////////////////////////////
//
// Enable/disable rendertargets in stack fashion
//
static ID3D1X(RenderTargetView) *get_active_render_target()
{
if (gdraw->cur->color_buffer) {
unbind_resources(); // to make sure this RT isn't accidentally set as a texture (avoid D3D warnings)
return gdraw->cur->color_buffer->handle.tex.d3d_rtview;
} else
return gdraw->main_framebuffer;
}
static void set_render_target(GDrawStats *stats)
{
ID3D1X(RenderTargetView) *target = get_active_render_target();
if (target == gdraw->main_framebuffer) {
gdraw->d3d_context->OMSetRenderTargets(1, &target, gdraw->depth_buffer[0]);
gdraw->d3d_context->RSSetState(gdraw->raster_state[gdraw->main_msaa]);
} else {
ID3D1X(DepthStencilView) *depth = NULL;
if (gdraw->cur->flags & (GDRAW_TEXTUREDRAWBUFFER_FLAGS_needs_id | GDRAW_TEXTUREDRAWBUFFER_FLAGS_needs_stencil))
depth = get_rendertarget_depthbuffer(stats);
gdraw->d3d_context->OMSetRenderTargets(1, &target, depth);
gdraw->d3d_context->RSSetState(gdraw->raster_state[0]);
}
stats->nonzero_flags |= GDRAW_STATS_rendtarg;
stats->rendertarget_changes += 1;
}
static rrbool RADLINK gdraw_TextureDrawBufferBegin(gswf_recti *region, gdraw_texture_format /*format*/, U32 flags, void *owner, GDrawStats *stats)
{
GDrawFramebufferState *n = gdraw->cur+1;
GDrawHandle *t = NULL;
if (gdraw->tw == 0 || gdraw->th == 0) {
IggyGDrawSendWarning(NULL, "GDraw warning: w=0,h=0 rendertarget");
return false;
}
if (n >= &gdraw->frame[MAX_RENDER_STACK_DEPTH]) {
assert(0);
IggyGDrawSendWarning(NULL, "GDraw rendertarget nesting exceeds MAX_RENDER_STACK_DEPTH");
return false;
}
if (owner) {
// not yet implemented: owned (cached) rendertargets; t stays NULL on this path
} else {
t = get_color_rendertarget(stats);
if (!t)
return false;
}
n->flags = flags;
n->color_buffer = t;
assert(n->color_buffer != NULL); // @GDRAW_ASSERT
++gdraw->cur;
gdraw->cur->cached = owner != NULL;
if (owner) {
gdraw->cur->base_x = region->x0;
gdraw->cur->base_y = region->y0;
gdraw->cur->width = region->x1 - region->x0;
gdraw->cur->height = region->y1 - region->y0;
}
set_render_target(stats);
assert(gdraw->frametex_width >= gdraw->tw && gdraw->frametex_height >= gdraw->th); // @GDRAW_ASSERT
S32 k = (S32) (t - gdraw->rendertargets.handle);
if (region) {
gswf_recti r;
S32 ox, oy, pad = 2; // 2 pixels of border on all sides
// 1 pixel turns out to be not quite enough with the interpolator precision we get.
if (gdraw->in_blur)
ox = oy = 0;
else
ox = gdraw->tx0p, oy = gdraw->ty0p;
// clamp region to tile
S32 xt0 = RR_MAX(region->x0 - ox, 0);
S32 yt0 = RR_MAX(region->y0 - oy, 0);
S32 xt1 = RR_MIN(region->x1 - ox, gdraw->tpw);
S32 yt1 = RR_MIN(region->y1 - oy, gdraw->tph);
// but the padding needs to clamp to render target bounds
r.x0 = RR_MAX(xt0 - pad, 0);
r.y0 = RR_MAX(yt0 - pad, 0);
r.x1 = RR_MIN(xt1 + pad, gdraw->frametex_width);
r.y1 = RR_MIN(yt1 + pad, gdraw->frametex_height);
if (r.x1 <= r.x0 || r.y1 <= r.y0) { // region doesn't intersect with current tile
--gdraw->cur;
gdraw_FreeTexture((GDrawTexture *) t, 0, stats);
// note: don't send a warning since this will happen during regular tiled rendering
return false;
}
manual_clear(&r, stats);
gdraw->rt_valid[k].x0 = xt0;
gdraw->rt_valid[k].y0 = yt0;
gdraw->rt_valid[k].x1 = xt1;
gdraw->rt_valid[k].y1 = yt1;
} else {
gdraw->d3d_context->ClearRenderTargetView(gdraw->cur->color_buffer->handle.tex.d3d_rtview, four_zeros);
gdraw->rt_valid[k].x0 = 0;
gdraw->rt_valid[k].y0 = 0;
gdraw->rt_valid[k].x1 = gdraw->frametex_width;
gdraw->rt_valid[k].y1 = gdraw->frametex_height;
}
if (!gdraw->in_blur) {
set_viewport();
set_projection();
} else {
set_viewport_raw(0, 0, gdraw->tpw, gdraw->tph);
set_projection_raw(0, gdraw->tpw, gdraw->tph, 0);
}
return true;
}
static GDrawTexture *RADLINK gdraw_TextureDrawBufferEnd(GDrawStats *stats)
{
GDrawFramebufferState *n = gdraw->cur;
GDrawFramebufferState *m = --gdraw->cur;
if (gdraw->tw == 0 || gdraw->th == 0) return 0;
if (n >= &gdraw->frame[MAX_RENDER_STACK_DEPTH])
return 0; // already returned a warning in Begin
assert(m >= gdraw->frame); // bug in Iggy -- unbalanced
if (m != gdraw->frame) {
assert(m->color_buffer != NULL); // @GDRAW_ASSERT
}
assert(n->color_buffer != NULL); // @GDRAW_ASSERT
// switch back to old render target
set_render_target(stats);
// if we're at the root, set the viewport back
set_viewport();
set_projection();
return (GDrawTexture *) n->color_buffer;
}
////////////////////////////////////////////////////////////////////////
//
// Clear stencil/depth buffers
//
// Open question whether we'd be better off finding bounding boxes
// and only clearing those; it depends exactly how fast clearing works.
//
static void RADLINK gdraw_ClearStencilBits(U32 /*bits*/)
{
gdraw->d3d_context->ClearDepthStencilView(gdraw->depth_buffer[0], D3D1X_(CLEAR_STENCIL), 1.0f, 0);
if (gdraw->depth_buffer[1])
gdraw->d3d_context->ClearDepthStencilView(gdraw->depth_buffer[1], D3D1X_(CLEAR_STENCIL), 1.0f, 0);
}
// this only happens rarely (hopefully never) if we use the depth buffer,
// so we can just clear the whole thing
static void RADLINK gdraw_ClearID(void)
{
gdraw->d3d_context->ClearDepthStencilView(gdraw->depth_buffer[0], D3D1X_(CLEAR_DEPTH), 1.0f, 0);
if (gdraw->depth_buffer[1])
gdraw->d3d_context->ClearDepthStencilView(gdraw->depth_buffer[1], D3D1X_(CLEAR_DEPTH), 1.0f, 0);
}
////////////////////////////////////////////////////////////////////////
//
// Set all the render state from GDrawRenderState
//
// This is also responsible for getting the framebuffer into a texture
// if the read-modify-write blend operation can't be expressed with
// the native blend operators. (E.g. "screen")
//
// convert an ID request to a value suitable for the depth buffer,
// assuming the depth buffer has been mapped to 0..1
static F32 depth_from_id(S32 id)
{
return 1.0f - ((F32) id + 1.0f) / MAX_DEPTH_VALUE;
}
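// Worked example, using MAX_DEPTH_VALUE = 1 << 13 = 8192 and the max_id of
// MAX_DEPTH_VALUE-2 = 8190 reported by gdraw_GetInfo:
//   depth_from_id(0)    = 1 - 1/8192    = 0.9998779296875
//   depth_from_id(8190) = 1 - 8191/8192 = 0.0001220703125
// so every valid id lands strictly between 0 and the cleared depth of 1.0,
// and larger ids map to smaller depth values.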
static void set_texture(S32 texunit, GDrawTexture *tex, rrbool nearest, S32 wrap)
{
ID3D1XContext *d3d = gdraw->d3d_context;
if (tex == NULL) {
ID3D1X(ShaderResourceView) *notex = NULL;
d3d->PSSetShaderResources(texunit, 1, &notex);
} else {
GDrawHandle *h = (GDrawHandle *) tex;
d3d->PSSetShaderResources(texunit, 1, &h->handle.tex.d3d_view);
d3d->PSSetSamplers(texunit, 1, &gdraw->sampler_state[nearest][wrap]);
}
}
static void RADLINK gdraw_Set3DTransform(F32 *mat)
{
if (mat == NULL)
gdraw->use_3d = 0;
else {
gdraw->use_3d = 1;
memcpy(gdraw->xform_3d, mat, sizeof(gdraw->xform_3d));
}
}
static int set_renderstate_full(S32 vertex_format, GDrawRenderState *r, GDrawStats * /* stats */, const F32 *rescale1)
{
ID3D1XContext *d3d = gdraw->d3d_context;
// set vertex shader
set_vertex_shader(d3d, gdraw->vert[vertex_format].vshader);
// set vertex shader constants
if (VertexVars *vvars = (VertexVars *) map_buffer(gdraw->d3d_context, gdraw->cb_vertex, true)) {
F32 depth = depth_from_id(r->id);
if (!r->use_world_space)
gdraw_ObjectSpace(vvars->world[0], r->o2w, depth, 0.0f);
else
gdraw_WorldSpace(vvars->world[0], gdraw->world_to_pixel, depth, 0.0f);
memcpy(&vvars->x_off, r->edge_matrix, 4*sizeof(F32));
if (r->texgen0_enabled) {
memcpy(&vvars->texgen_s, r->s0_texgen, 4*sizeof(F32));
memcpy(&vvars->texgen_t, r->t0_texgen, 4*sizeof(F32));
}
if (gdraw->use_3d)
memcpy(vvars->x3d, gdraw->xform_3d, 12*sizeof(F32));
else
memcpy(vvars->x3d, gdraw->projmat, 12*sizeof(F32));
unmap_buffer(gdraw->d3d_context, gdraw->cb_vertex);
d3d->VSSetConstantBuffers(0, 1, &gdraw->cb_vertex);
}
// set the blend mode
int blend_mode = r->blend_mode;
if (blend_mode != gdraw->blend_mode) {
gdraw->blend_mode = blend_mode;
d3d->OMSetBlendState(gdraw->blend_state[blend_mode], four_zeros, ~0u);
}
// set the fragment program
if (blend_mode != GDRAW_BLEND_special) {
int which = r->tex0_mode;
assert(which >= 0 && which < (int) (sizeof(gdraw->fprog) / sizeof(*gdraw->fprog))); // cast avoids a signed/unsigned comparison
int additive = 0;
if (r->cxf_add) {
additive = 1;
if (r->cxf_add[3]) additive = 2;
}
ID3D1X(PixelShader) *program = gdraw->fprog[which][additive].pshader;
if (r->stencil_set) {
// in stencil set mode, prefer not doing any shading at all
// but if alpha test is on, we need to make an exception
#ifndef GDRAW_D3D11_LEVEL9 // level9 can't do NULL PS it seems
if (which != GDRAW_TEXTURE_alpha_test)
program = NULL;
else
#endif
{
gdraw->blend_mode = -1;
d3d->OMSetBlendState(gdraw->blend_no_color_write, four_zeros, ~0u);
}
}
set_pixel_shader(d3d, program);
} else
set_pixel_shader(d3d, gdraw->exceptional_blend[r->special_blend].pshader);
set_texture(0, r->tex[0], r->nearest0, r->wrap0);
// pixel shader constants
if (PixelCommonVars *pvars = (PixelCommonVars *) map_buffer(gdraw->d3d_context, gdraw->cb_ps_common, true)) {
memcpy(pvars->color_mul, r->color, 4*sizeof(float));
if (r->cxf_add) {
pvars->color_add[0] = r->cxf_add[0] / 255.0f;
pvars->color_add[1] = r->cxf_add[1] / 255.0f;
pvars->color_add[2] = r->cxf_add[2] / 255.0f;
pvars->color_add[3] = r->cxf_add[3] / 255.0f;
} else
pvars->color_add[0] = pvars->color_add[1] = pvars->color_add[2] = pvars->color_add[3] = 0.0f;
if (r->tex0_mode == GDRAW_TEXTURE_focal_gradient) memcpy(pvars->focal, r->focal_point, 4*sizeof(float));
if (r->blend_mode == GDRAW_BLEND_special) memcpy(pvars->rescale1, rescale1, 4*sizeof(float));
unmap_buffer(gdraw->d3d_context, gdraw->cb_ps_common);
d3d->PSSetConstantBuffers(0, 1, &gdraw->cb_ps_common);
}
// Set pixel operation states
if (r->scissor) {
D3D1X_(RECT) s;
gdraw->scissor_state = 1;
if (gdraw->cur == gdraw->frame) {
s.left = r->scissor_rect.x0 + gdraw->vx - gdraw->tx0;
s.top = r->scissor_rect.y0 + gdraw->vy - gdraw->ty0;
s.right = r->scissor_rect.x1 + gdraw->vx - gdraw->tx0;
s.bottom = r->scissor_rect.y1 + gdraw->vy - gdraw->ty0;
} else {
s.left = r->scissor_rect.x0 - gdraw->tx0p;
s.top = r->scissor_rect.y0 - gdraw->ty0p;
s.right = r->scissor_rect.x1 - gdraw->tx0p;
s.bottom = r->scissor_rect.y1 - gdraw->ty0p;
}
d3d->RSSetScissorRects(1, &s);
} else if (r->scissor != gdraw->scissor_state)
disable_scissor(0);
if (r->stencil_set | r->stencil_test)
d3d->OMSetDepthStencilState(stencil_state_cache_lookup(r->set_id, r->test_id, r->stencil_test, r->stencil_set), 255);
else
d3d->OMSetDepthStencilState(gdraw->depth_state[r->set_id][r->test_id], 0);
return 1;
}
static RADINLINE int set_renderstate(S32 vertex_format, GDrawRenderState *r, GDrawStats *stats)
{
static const F32 unit_rescale[4] = { 1.0f, 1.0f, 0.0f, 0.0f };
if (r->identical_state) {
// fast path: only need to change vertex shader, other state is the same
set_vertex_shader(gdraw->d3d_context, gdraw->vert[vertex_format].vshader);
return 1;
} else
return set_renderstate_full(vertex_format, r, stats, unit_rescale);
}
////////////////////////////////////////////////////////////////////////
//
// Vertex formats
//
static D3D1X_(INPUT_ELEMENT_DESC) vformat_v2[] = {
{ "POSITION", 0, DXGI_FORMAT_R32G32_FLOAT, 0, 0, D3D1X_(INPUT_PER_VERTEX_DATA), 0 },
};
static D3D1X_(INPUT_ELEMENT_DESC) vformat_v2aa[] = {
{ "POSITION", 0, DXGI_FORMAT_R32G32_FLOAT, 0, 0, D3D1X_(INPUT_PER_VERTEX_DATA), 0 },
{ "TEXCOORD", 0, DXGI_FORMAT_R16G16B16A16_SINT, 0, 8, D3D1X_(INPUT_PER_VERTEX_DATA), 0 },
};
static D3D1X_(INPUT_ELEMENT_DESC) vformat_v2tc2[] = {
{ "POSITION", 0, DXGI_FORMAT_R32G32_FLOAT, 0, 0, D3D1X_(INPUT_PER_VERTEX_DATA), 0 },
{ "TEXCOORD", 0, DXGI_FORMAT_R32G32_FLOAT, 0, 8, D3D1X_(INPUT_PER_VERTEX_DATA), 0 },
};
static struct gdraw_vertex_format_desc {
D3D1X_(INPUT_ELEMENT_DESC) *desc;
U32 nelem;
} vformats[ASSERT_COUNT(GDRAW_vformat__basic_count, 3)] = {
vformat_v2, 1, // GDRAW_vformat_v2
vformat_v2aa, 2, // GDRAW_vformat_v2aa
vformat_v2tc2, 2, // GDRAW_vformat_v2tc2
};
static int vertsize[GDRAW_vformat__basic_count] = {
8, // GDRAW_vformat_v2
16, // GDRAW_vformat_v2aa
16, // GDRAW_vformat_v2tc2
};
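// How the sizes above follow from the input layouts:
//   v2:    POSITION R32G32_FLOAT                  = 2 * 4 bytes =  8
//   v2aa:  POSITION (8) + TEXCOORD R16G16B16A16_SINT (4 * 2 = 8) = 16
//   v2tc2: POSITION (8) + TEXCOORD R32G32_FLOAT      (2 * 4 = 8) = 16
// In both two-element formats TEXCOORD starts at byte offset 8, directly
// after the position, which matches the offsets given in the descriptors.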
////////////////////////////////////////////////////////////////////////
//
// Draw triangles with a given renderstate
//
static void tag_resources(void *r1, void *r2=NULL, void *r3=NULL, void *r4=NULL)
{
U64 now = gdraw->frame_counter;
if (r1) ((GDrawHandle *) r1)->fence.value = now;
if (r2) ((GDrawHandle *) r2)->fence.value = now;
if (r3) ((GDrawHandle *) r3)->fence.value = now;
if (r4) ((GDrawHandle *) r4)->fence.value = now;
}
static void RADLINK gdraw_DrawIndexedTriangles(GDrawRenderState *r, GDrawPrimitive *p, GDrawVertexBuffer *buf, GDrawStats *stats)
{
ID3D1XContext *d3d = gdraw->d3d_context;
GDrawHandle *vb = (GDrawHandle *) buf;
int vfmt = p->vertex_format;
assert(vfmt >= 0 && vfmt < GDRAW_vformat__count);
if (!set_renderstate(vfmt, r, stats))
return;
UINT stride = vertsize[vfmt];
d3d->IASetInputLayout(gdraw->inlayout[vfmt]);
if (vb) {
UINT offs = (UINT) (UINTa) p->vertices;
d3d->IASetVertexBuffers(0, 1, &vb->handle.vbuf.verts, &stride, &offs);
d3d->IASetIndexBuffer(vb->handle.vbuf.inds, DXGI_FORMAT_R16_UINT, (UINT) (UINTa) p->indices);
d3d->DrawIndexed(p->num_indices, 0, 0);
} else if (p->indices) {
U32 vbytes = p->num_vertices * stride;
U32 ibytes = p->num_indices * 2;
if (void *vbptr = start_write_dyn(&gdraw->dyn_vb, vbytes)) {
memcpy(vbptr, p->vertices, vbytes);
UINT vboffs = end_write_dyn(&gdraw->dyn_vb);
if (void *ibptr = start_write_dyn(&gdraw->dyn_ib, ibytes)) {
memcpy(ibptr, p->indices, ibytes);
UINT iboffs = end_write_dyn(&gdraw->dyn_ib);
d3d->IASetVertexBuffers(0, 1, &gdraw->dyn_vb.buffer, &stride, &vboffs);
d3d->IASetIndexBuffer(gdraw->dyn_ib.buffer, DXGI_FORMAT_R16_UINT, iboffs);
d3d->DrawIndexed(p->num_indices, 0, 0);
}
}
} else { // dynamic quads
assert(p->num_vertices % 4 == 0);
d3d->IASetIndexBuffer(gdraw->quad_ib, DXGI_FORMAT_R16_UINT, 0);
if (gdraw->max_quad_vert_count) {
S32 pos = 0;
while (pos < p->num_vertices) {
S32 vert_count = RR_MIN(p->num_vertices - pos, gdraw->max_quad_vert_count);
UINT chunk_bytes = vert_count * stride;
if (void *vbptr = start_write_dyn(&gdraw->dyn_vb, chunk_bytes)) {
memcpy(vbptr, (U8 *)p->vertices + pos*stride, chunk_bytes);
UINT offs = end_write_dyn(&gdraw->dyn_vb);
d3d->IASetVertexBuffers(0, 1, &gdraw->dyn_vb.buffer, &stride, &offs);
d3d->DrawIndexed((vert_count >> 2) * 6, 0, 0);
}
pos += vert_count;
}
}
}
tag_resources(vb, r->tex[0], r->tex[1]);
stats->nonzero_flags |= GDRAW_STATS_batches;
stats->num_batches += 1;
stats->drawn_indices += p->num_indices;
stats->drawn_vertices += p->num_vertices;
}
///////////////////////////////////////////////////////////////////////
//
// Flash 8 filter effects
//
static void *start_ps_constants(ID3D1X(Buffer) *buffer)
{
return map_buffer(gdraw->d3d_context, buffer, true);
}
static void end_ps_constants(ID3D1X(Buffer) *buffer)
{
unmap_buffer(gdraw->d3d_context, buffer);
gdraw->d3d_context->PSSetConstantBuffers(1, 1, &buffer);
}
static void set_pixel_constant(F32 *constant, F32 x, F32 y, F32 z, F32 w)
{
constant[0] = x;
constant[1] = y;
constant[2] = z;
constant[3] = w;
}
// caller sets up texture coordinates
static void do_screen_quad(gswf_recti *s, const F32 *tc, GDrawStats *stats)
{
ID3D1XContext *d3d = gdraw->d3d_context;
F32 px0 = (F32) s->x0, py0 = (F32) s->y0, px1 = (F32) s->x1, py1 = (F32) s->y1;
// generate vertex data
gswf_vertex_xyst *vert = (gswf_vertex_xyst *) start_write_dyn(&gdraw->dyn_vb, 4 * sizeof(gswf_vertex_xyst));
if (!vert)
return;
vert[0].x = px0; vert[0].y = py0; vert[0].s = tc[0]; vert[0].t = tc[1];
vert[1].x = px1; vert[1].y = py0; vert[1].s = tc[2]; vert[1].t = tc[1];
vert[2].x = px0; vert[2].y = py1; vert[2].s = tc[0]; vert[2].t = tc[3];
vert[3].x = px1; vert[3].y = py1; vert[3].s = tc[2]; vert[3].t = tc[3];
UINT offs = end_write_dyn(&gdraw->dyn_vb);
UINT stride = sizeof(gswf_vertex_xyst);
if (VertexVars *vvars = (VertexVars *) map_buffer(gdraw->d3d_context, gdraw->cb_vertex, true)) {
gdraw_PixelSpace(vvars->world[0]);
memcpy(vvars->x3d, gdraw->projmat, 12*sizeof(F32));
unmap_buffer(gdraw->d3d_context, gdraw->cb_vertex);
d3d->VSSetConstantBuffers(0, 1, &gdraw->cb_vertex);
set_vertex_shader(d3d, gdraw->vert[GDRAW_vformat_v2tc2].vshader);
d3d->IASetInputLayout(gdraw->inlayout[GDRAW_vformat_v2tc2]);
d3d->IASetVertexBuffers(0, 1, &gdraw->dyn_vb.buffer, &stride, &offs);
d3d->IASetPrimitiveTopology(D3D1X_(PRIMITIVE_TOPOLOGY_TRIANGLESTRIP));
d3d->Draw(4, 0);
d3d->IASetPrimitiveTopology(D3D1X_(PRIMITIVE_TOPOLOGY_TRIANGLELIST));
stats->nonzero_flags |= GDRAW_STATS_batches;
stats->num_batches += 1;
stats->drawn_indices += 6;
stats->drawn_vertices += 4;
}
}
static void manual_clear(gswf_recti *r, GDrawStats *stats)
{
ID3D1XContext *d3d = gdraw->d3d_context;
// go to known render state
d3d->OMSetBlendState(gdraw->blend_state[GDRAW_BLEND_none], four_zeros, ~0u);
d3d->OMSetDepthStencilState(gdraw->depth_state[0][0], 0);
gdraw->blend_mode = GDRAW_BLEND_none;
set_viewport_raw(0, 0, gdraw->frametex_width, gdraw->frametex_height);
set_projection_raw(0, gdraw->frametex_width, gdraw->frametex_height, 0);
set_pixel_shader(d3d, gdraw->clear_ps.pshader);
if (PixelCommonVars *pvars = (PixelCommonVars *) map_buffer(gdraw->d3d_context, gdraw->cb_ps_common, true)) {
memset(pvars, 0, sizeof(*pvars));
unmap_buffer(gdraw->d3d_context, gdraw->cb_ps_common);
d3d->PSSetConstantBuffers(0, 1, &gdraw->cb_ps_common);
do_screen_quad(r, four_zeros, stats);
}
}
static void gdraw_DriverBlurPass(GDrawRenderState *r, int taps, float *data, gswf_recti *s, float *tc, float /*height_max*/, float *clamp, GDrawStats *gstats)
{
set_texture(0, r->tex[0], false, GDRAW_WRAP_clamp);
set_pixel_shader(gdraw->d3d_context, gdraw->blur_prog[taps].pshader);
PixelParaBlur *para = (PixelParaBlur *) start_ps_constants(gdraw->cb_blur);
memcpy(para->clamp, clamp, 4 * sizeof(float));
memcpy(para->tap, data, taps * 4 * sizeof(float));
end_ps_constants(gdraw->cb_blur);
do_screen_quad(s, tc, gstats);
tag_resources(r->tex[0]);
}
static void gdraw_Colormatrix(GDrawRenderState *r, gswf_recti *s, float *tc, GDrawStats *stats)
{
if (!gdraw_TextureDrawBufferBegin(s, GDRAW_TEXTURE_FORMAT_rgba32, GDRAW_TEXTUREDRAWBUFFER_FLAGS_needs_color | GDRAW_TEXTUREDRAWBUFFER_FLAGS_needs_alpha, 0, stats))
return;
set_texture(0, r->tex[0], false, GDRAW_WRAP_clamp);
set_pixel_shader(gdraw->d3d_context, gdraw->colormatrix.pshader);
PixelParaColorMatrix *para = (PixelParaColorMatrix *) start_ps_constants(gdraw->cb_colormatrix);
memcpy(para->data, r->shader_data, 5 * 4 * sizeof(float));
end_ps_constants(gdraw->cb_colormatrix);
do_screen_quad(s, tc, stats);
tag_resources(r->tex[0]);
r->tex[0] = gdraw_TextureDrawBufferEnd(stats);
}
static gswf_recti *get_valid_rect(GDrawTexture *tex)
{
GDrawHandle *h = (GDrawHandle *) tex;
S32 n = (S32) (h - gdraw->rendertargets.handle);
assert(n >= 0 && n < MAX_RENDER_STACK_DEPTH+1); // the rendertarget pool holds MAX_RENDER_STACK_DEPTH+1 handles
return &gdraw->rt_valid[n];
}
static void set_clamp_constant(F32 *constant, GDrawTexture *tex)
{
gswf_recti *s = get_valid_rect(tex);
// when we make the valid data, we make sure there is an extra empty pixel at the border
set_pixel_constant(constant,
(s->x0-0.5f) / gdraw->frametex_width,
(s->y0-0.5f) / gdraw->frametex_height,
(s->x1+0.5f) / gdraw->frametex_width,
(s->y1+0.5f) / gdraw->frametex_height);
}
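// example for set_clamp_constant (illustrative numbers, not from the source):
// with a valid rect of x0=10, x1=100 and frametex_width=1024, the horizontal
// clamp range comes out as
//   (10 - 0.5) / 1024  = 0.00927734375
//   (100 + 0.5) / 1024 = 0.09814453125
// i.e. the range is widened by half a texel on each side of the valid rect.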
static void gdraw_Filter(GDrawRenderState *r, gswf_recti *s, float *tc, int isbevel, GDrawStats *stats)
{
if (!gdraw_TextureDrawBufferBegin(s, GDRAW_TEXTURE_FORMAT_rgba32, GDRAW_TEXTUREDRAWBUFFER_FLAGS_needs_color | GDRAW_TEXTUREDRAWBUFFER_FLAGS_needs_alpha, NULL, stats))
return;
set_texture(0, r->tex[0], false, GDRAW_WRAP_clamp);
set_texture(1, r->tex[1], false, GDRAW_WRAP_clamp);
set_texture(2, r->tex[2], false, GDRAW_WRAP_clamp);
set_pixel_shader(gdraw->d3d_context, gdraw->filter_prog[isbevel][r->filter_mode].pshader);
PixelParaFilter *para = (PixelParaFilter *) start_ps_constants(gdraw->cb_filter);
set_clamp_constant(para->clamp0, r->tex[0]);
set_clamp_constant(para->clamp1, r->tex[1]);
set_pixel_constant(para->color, r->shader_data[0], r->shader_data[1], r->shader_data[2], r->shader_data[3]);
set_pixel_constant(para->color2, r->shader_data[8], r->shader_data[9], r->shader_data[10], r->shader_data[11]);
set_pixel_constant(para->tc_off, -r->shader_data[4] / (F32)gdraw->frametex_width, -r->shader_data[5] / (F32)gdraw->frametex_height, r->shader_data[6], 0);
end_ps_constants(gdraw->cb_filter);
do_screen_quad(s, tc, stats);
tag_resources(r->tex[0], r->tex[1], r->tex[2]);
r->tex[0] = gdraw_TextureDrawBufferEnd(stats);
}
static void RADLINK gdraw_FilterQuad(GDrawRenderState *r, S32 x0, S32 y0, S32 x1, S32 y1, GDrawStats *stats)
{
ID3D1XContext *d3d = gdraw->d3d_context;
F32 tc[4];
gswf_recti s;
// clip to tile boundaries
s.x0 = RR_MAX(x0, gdraw->tx0p);
s.y0 = RR_MAX(y0, gdraw->ty0p);
s.x1 = RR_MIN(x1, gdraw->tx0p + gdraw->tpw);
s.y1 = RR_MIN(y1, gdraw->ty0p + gdraw->tph);
if (s.x1 < s.x0 || s.y1 < s.y0)
return;
tc[0] = (s.x0 - gdraw->tx0p) / (F32) gdraw->frametex_width;
tc[1] = (s.y0 - gdraw->ty0p) / (F32) gdraw->frametex_height;
tc[2] = (s.x1 - gdraw->tx0p) / (F32) gdraw->frametex_width;
tc[3] = (s.y1 - gdraw->ty0p) / (F32) gdraw->frametex_height;
// clear to known render state
d3d->OMSetBlendState(gdraw->blend_state[GDRAW_BLEND_none], four_zeros, ~0u);
d3d->OMSetDepthStencilState(gdraw->depth_state[0][0], 0);
disable_scissor(0);
gdraw->blend_mode = GDRAW_BLEND_none;
if (r->blend_mode == GDRAW_BLEND_filter) {
switch (r->filter) {
case GDRAW_FILTER_blur: {
GDrawBlurInfo b;
gswf_recti bounds = *get_valid_rect(r->tex[0]);
gdraw_ShiftRect(&s, &s, -gdraw->tx0p, -gdraw->ty0p); // blur uses physical rendertarget coordinates
b.BlurPass = gdraw_DriverBlurPass;
b.w = gdraw->tpw;
b.h = gdraw->tph;
b.frametex_width = gdraw->frametex_width;
b.frametex_height = gdraw->frametex_height;
// blur needs to draw with multiple passes, so set up special state
gdraw->in_blur = true;
// do the blur
gdraw_Blur(&gdraw_funcs, &b, r, &s, &bounds, stats);
// restore the normal state
gdraw->in_blur = false;
set_viewport();
set_projection();
break;
}
case GDRAW_FILTER_colormatrix:
gdraw_Colormatrix(r, &s, tc, stats);
break;
case GDRAW_FILTER_dropshadow:
gdraw_Filter(r, &s, tc, 0, stats);
break;
case GDRAW_FILTER_bevel:
gdraw_Filter(r, &s, tc, 1, stats);
break;
default:
assert(0);
}
} else {
GDrawHandle *blend_tex = NULL;
// for crazy blend modes, we need to read back from the framebuffer
// and do the blending in the pixel shader. we do this with copies
// rather than trying to render-to-texture-all-along, because we want
// to be able to render over the user's existing framebuffer, which might
// not be a texture. note that this isn't optimal when MSAA is on!
F32 rescale1[4] = { 1.0f, 1.0f, 0.0f, 0.0f };
if (r->blend_mode == GDRAW_BLEND_special) {
ID3D1XContext *d3d = gdraw->d3d_context;
ID3D1X(Resource) *cur_rt_rsrc;
get_active_render_target()->GetResource(&cur_rt_rsrc);
if (gdraw->cur == gdraw->frame && gdraw->main_msaa) {
// source surface is main framebuffer and it uses MSAA. just resolve it first.
D3D1X_(SHADER_RESOURCE_VIEW_DESC) desc;
D3D1X_(TEXTURE2D_DESC) texdesc;
ID3D1X(Texture2D) *resolve_tex;
gdraw->main_resolve_target->GetDesc(&desc);
gdraw->main_resolve_target->GetResource((ID3D1X(Resource) **) &resolve_tex);
resolve_tex->GetDesc(&texdesc);
d3d->ResolveSubresource(resolve_tex, 0, cur_rt_rsrc, 0, desc.Format);
resolve_tex->Release();
stats->nonzero_flags |= GDRAW_STATS_blits;
stats->num_blits += 1;
stats->num_blit_pixels += texdesc.Width * texdesc.Height;
d3d->PSSetShaderResources(1, 1, &gdraw->main_resolve_target);
d3d->PSSetSamplers(1, 1, &gdraw->sampler_state[0][GDRAW_WRAP_clamp]);
// calculate texture coordinate remapping
rescale1[0] = gdraw->frametex_width / (F32) texdesc.Width;
rescale1[1] = gdraw->frametex_height / (F32) texdesc.Height;
rescale1[2] = (gdraw->vx - gdraw->tx0 + gdraw->tx0p) / (F32) texdesc.Width;
rescale1[3] = (gdraw->vy - gdraw->ty0 + gdraw->ty0p) / (F32) texdesc.Height;
} else {
D3D1X_(BOX) box = { 0,0,0,0,0,1 };
S32 dx = 0, dy = 0;
blend_tex = get_color_rendertarget(stats);
if (gdraw->cur != gdraw->frame)
box.right=gdraw->tpw, box.bottom=gdraw->tph;
else {
box.left=gdraw->vx, box.top=gdraw->vy, box.right=gdraw->vx+gdraw->tw, box.bottom=gdraw->vy+gdraw->th;
dx = gdraw->tx0 - gdraw->tx0p;
dy = gdraw->ty0 - gdraw->ty0p;
}
d3d->CopySubresourceRegion(blend_tex->handle.tex.d3d, 0, dx, dy, 0,
cur_rt_rsrc, 0, &box);
stats->nonzero_flags |= GDRAW_STATS_blits;
stats->num_blits += 1;
stats->num_blit_pixels += (box.right - box.left) * (box.bottom - box.top);
set_texture(1, (GDrawTexture *) blend_tex, false, GDRAW_WRAP_clamp);
}
cur_rt_rsrc->Release();
}
if (!set_renderstate_full(GDRAW_vformat_v2tc2, r, stats, rescale1))
return;
do_screen_quad(&s, tc, stats);
tag_resources(r->tex[0], r->tex[1]);
if (blend_tex)
gdraw_FreeTexture((GDrawTexture *) blend_tex, 0, stats);
}
}
///////////////////////////////////////////////////////////////////////
//
// Shaders and state
//
#include GDRAW_SHADER_FILE
static void destroy_shader(ProgramWithCachedVariableLocations *p)
{
if (p->pshader) {
p->pshader->Release();
p->pshader = NULL;
}
}
static ID3D1X(Buffer) *create_dynamic_buffer(U32 size, U32 bind)
{
D3D1X_(BUFFER_DESC) desc = { size, D3D1X_(USAGE_DYNAMIC), bind, D3D1X_(CPU_ACCESS_WRITE), 0 };
ID3D1X(Buffer) *buf = NULL;
HRESULT hr = gdraw->d3d_device->CreateBuffer(&desc, NULL, &buf);
if (FAILED(hr)) {
report_d3d_error(hr, "CreateBuffer", " creating dynamic buffer");
buf = NULL;
}
return buf;
}
static void init_dyn_buffer(DynBuffer *buf, U32 size, U32 bind)
{
buf->buffer = create_dynamic_buffer(size, bind);
buf->size = size;
buf->write_pos = 0;
buf->alloc_pos = 0;
}
// These two functions are implemented by the D3D10- or D3D11-specific part, respectively.
static void create_pixel_shader(ProgramWithCachedVariableLocations *p, ProgramWithCachedVariableLocations *src);
static void create_vertex_shader(ProgramWithCachedVariableLocations *p, ProgramWithCachedVariableLocations *src);
static void create_all_shaders_and_state(void)
{
ID3D1X(Device) *d3d = gdraw->d3d_device;
HRESULT hr;
S32 i, j;
for (i=0; i < GDRAW_TEXTURE__count*3; ++i) create_pixel_shader(&gdraw->fprog[0][i], pshader_basic_arr + i);
for (i=0; i < GDRAW_BLENDSPECIAL__count; ++i) create_pixel_shader(&gdraw->exceptional_blend[i], pshader_exceptional_blend_arr + i);
for (i=0; i < 32; ++i) create_pixel_shader(&gdraw->filter_prog[0][i], pshader_filter_arr + i);
for (i=0; i < MAX_TAPS+1; ++i) create_pixel_shader(&gdraw->blur_prog[i], pshader_blur_arr + i);
create_pixel_shader(&gdraw->colormatrix, pshader_color_matrix_arr);
create_pixel_shader(&gdraw->clear_ps, pshader_manual_clear_arr);
for (i=0; i < GDRAW_vformat__basic_count; i++) {
ProgramWithCachedVariableLocations *vsh = vshader_vsd3d10_arr + i;
create_vertex_shader(&gdraw->vert[i], vsh);
hr = d3d->CreateInputLayout(vformats[i].desc, vformats[i].nelem, vsh->bytecode, vsh->size, &gdraw->inlayout[i]);
if (FAILED(hr)) {
report_d3d_error(hr, "CreateInputLayout", "");
gdraw->inlayout[i] = NULL;
}
}
// create rasterizer state setups
for (i=0; i < 2; ++i) {
D3D1X_(RASTERIZER_DESC) raster_desc = { D3D1X_(FILL_SOLID), D3D1X_(CULL_NONE), FALSE, 0, 0.0f, 0.0f, TRUE, TRUE, FALSE, FALSE };
raster_desc.MultisampleEnable = i;
hr = d3d->CreateRasterizerState(&raster_desc, &gdraw->raster_state[i]);
if (FAILED(hr)) {
report_d3d_error(hr, "CreateRasterizerState", "");
return;
}
}
// create sampler state setups
static const D3D1X_(TEXTURE_ADDRESS_MODE) addrmode[ASSERT_COUNT(GDRAW_WRAP__count, 4)] = {
D3D1X_(TEXTURE_ADDRESS_CLAMP), // GDRAW_WRAP_clamp
D3D1X_(TEXTURE_ADDRESS_WRAP), // GDRAW_WRAP_repeat
D3D1X_(TEXTURE_ADDRESS_MIRROR), // GDRAW_WRAP_mirror
D3D1X_(TEXTURE_ADDRESS_CLAMP), // GDRAW_WRAP_clamp_to_border (unused for this renderer)
};
for (i=0; i < 2; ++i) {
for (j=0; j < GDRAW_WRAP__count; ++j) {
D3D1X_(SAMPLER_DESC) sampler_desc;
memset(&sampler_desc, 0, sizeof(sampler_desc));
sampler_desc.Filter = i ? D3D1X_(FILTER_MIN_LINEAR_MAG_MIP_POINT) : D3D1X_(FILTER_MIN_MAG_MIP_LINEAR);
sampler_desc.AddressU = addrmode[j];
sampler_desc.AddressV = addrmode[j];
sampler_desc.AddressW = D3D1X_(TEXTURE_ADDRESS_CLAMP);
sampler_desc.MaxAnisotropy = 1;
sampler_desc.MaxLOD = D3D1X_(FLOAT32_MAX);
hr = d3d->CreateSamplerState(&sampler_desc, &gdraw->sampler_state[i][j]);
if (FAILED(hr)) {
report_d3d_error(hr, "CreateSamplerState", "");
return;
}
}
}
// create blend stage setups
static struct blendspec {
BOOL blend;
D3D1X_(BLEND) src;
D3D1X_(BLEND) dst;
} blends[ASSERT_COUNT(GDRAW_BLEND__count, 6)] = {
FALSE, D3D1X_(BLEND_ONE), D3D1X_(BLEND_ZERO), // GDRAW_BLEND_none
TRUE, D3D1X_(BLEND_ONE), D3D1X_(BLEND_INV_SRC_ALPHA), // GDRAW_BLEND_alpha
TRUE, D3D1X_(BLEND_DEST_COLOR), D3D1X_(BLEND_INV_SRC_ALPHA), // GDRAW_BLEND_multiply
TRUE, D3D1X_(BLEND_ONE), D3D1X_(BLEND_ONE), // GDRAW_BLEND_add
FALSE, D3D1X_(BLEND_ONE), D3D1X_(BLEND_ZERO), // GDRAW_BLEND_filter
FALSE, D3D1X_(BLEND_ONE), D3D1X_(BLEND_ZERO), // GDRAW_BLEND_special
};
for (i=0; i < GDRAW_BLEND__count; ++i) {
gdraw->blend_state[i] = create_blend_state(d3d, blends[i].blend, blends[i].src, blends[i].dst);
if (!gdraw->blend_state[i])
return;
}
D3D1X_(BLEND_DESC) blend_desc;
memset(&blend_desc, 0, sizeof(blend_desc));
hr = d3d->CreateBlendState(&blend_desc, &gdraw->blend_no_color_write);
if (FAILED(hr)) {
report_d3d_error(hr, "CreateBlendState", "");
return;
}
// create depth/stencil setups
for (i=0; i < 2; ++i) {
for (j=0; j < 2; ++j) {
D3D1X_(DEPTH_STENCIL_DESC) depth_desc;
memset(&depth_desc, 0, sizeof(depth_desc));
depth_desc.DepthEnable = (i || j);
depth_desc.DepthWriteMask = i ? D3D1X_(DEPTH_WRITE_MASK_ALL) : D3D1X_(DEPTH_WRITE_MASK_ZERO);
depth_desc.DepthFunc = j ? D3D1X_(COMPARISON_LESS) : D3D1X_(COMPARISON_ALWAYS);
depth_desc.StencilEnable = FALSE;
hr = d3d->CreateDepthStencilState(&depth_desc, &gdraw->depth_state[i][j]);
if (FAILED(hr)) {
report_d3d_error(hr, "CreateDepthStencilState", "");
return;
}
}
}
// constant buffers
gdraw->cb_vertex = create_dynamic_buffer(sizeof(VertexVars), D3D1X_(BIND_CONSTANT_BUFFER));
gdraw->cb_ps_common = create_dynamic_buffer(sizeof(PixelCommonVars), D3D1X_(BIND_CONSTANT_BUFFER));
gdraw->cb_filter = create_dynamic_buffer(sizeof(PixelParaFilter), D3D1X_(BIND_CONSTANT_BUFFER));
gdraw->cb_colormatrix = create_dynamic_buffer(sizeof(PixelParaColorMatrix), D3D1X_(BIND_CONSTANT_BUFFER));
gdraw->cb_blur = create_dynamic_buffer(sizeof(PixelParaBlur), D3D1X_(BIND_CONSTANT_BUFFER));
// quad index buffer
assert(QUAD_IB_COUNT * 4 < 65535); // can't use more; we have 16-bit index buffers and 0xffff = primitive cut index
U16 *inds = (U16 *) IggyGDrawMalloc(QUAD_IB_COUNT * 6 * sizeof(U16));
if (inds) {
D3D1X_(BUFFER_DESC) bufdesc = { };
D3D1X_(SUBRESOURCE_DATA) data = { inds, 0, 0 };
bufdesc.ByteWidth = QUAD_IB_COUNT * 6 * sizeof(U16);
bufdesc.Usage = D3D1X_(USAGE_IMMUTABLE);
bufdesc.BindFlags = D3D1X_(BIND_INDEX_BUFFER);
// quad q uses vertices 4q..4q+3 and is split into triangles (0,1,2) and (0,2,3)
for (U16 q=0; q < QUAD_IB_COUNT; q++) {
inds[q*6 + 0] = q*4 + 0;
inds[q*6 + 1] = q*4 + 1;
inds[q*6 + 2] = q*4 + 2;
inds[q*6 + 3] = q*4 + 0;
inds[q*6 + 4] = q*4 + 2;
inds[q*6 + 5] = q*4 + 3;
}
hr = gdraw->d3d_device->CreateBuffer(&bufdesc, &data, &gdraw->quad_ib);
if (FAILED(hr)) {
report_d3d_error(hr, "CreateBuffer", " for quad index buffer");
gdraw->quad_ib = NULL;
}
IggyGDrawFree(inds);
} else
gdraw->quad_ib = NULL;
}
static void destroy_all_shaders_and_state()
{
S32 i;
for (i=0; i < GDRAW_TEXTURE__count*3; ++i) destroy_shader(&gdraw->fprog[0][i]);
for (i=0; i < GDRAW_BLENDSPECIAL__count; ++i) destroy_shader(&gdraw->exceptional_blend[i]);
for (i=0; i < 32; ++i) destroy_shader(&gdraw->filter_prog[0][i]);
for (i=0; i < MAX_TAPS+1; ++i) destroy_shader(&gdraw->blur_prog[i]);
destroy_shader(&gdraw->colormatrix);
destroy_shader(&gdraw->clear_ps);
for (i=0; i < GDRAW_vformat__basic_count; i++) {
safe_release(gdraw->inlayout[i]);
destroy_shader(&gdraw->vert[i]);
}
for (i=0; i < 2; ++i) safe_release(gdraw->raster_state[i]);
for (i=0; i < GDRAW_WRAP__count*2; ++i) safe_release(gdraw->sampler_state[0][i]);
for (i=0; i < GDRAW_BLEND__count; ++i) safe_release(gdraw->blend_state[i]);
for (i=0; i < 2*2; ++i) safe_release(gdraw->depth_state[0][i]);
safe_release(gdraw->blend_no_color_write);
safe_release(gdraw->cb_vertex);
safe_release(gdraw->cb_ps_common);
safe_release(gdraw->cb_filter);
safe_release(gdraw->cb_colormatrix);
safe_release(gdraw->cb_blur);
safe_release(gdraw->quad_ib);
}
////////////////////////////////////////////////////////////////////////
//
// Create and tear down the state
//
typedef struct
{
S32 num_handles;
S32 num_bytes;
} GDrawResourceLimit;
// These are the default limits used by GDraw unless the user specifies something else.
static GDrawResourceLimit gdraw_limits[GDRAW_D3D1X_(RESOURCE__count)] = {
MAX_RENDER_STACK_DEPTH + 1, 16*1024*1024, // RESOURCE_rendertarget
500, 16*1024*1024, // RESOURCE_texture
1000, 2*1024*1024, // RESOURCE_vertexbuffer
0, 256*1024, // RESOURCE_dynbuffer
};
static GDrawHandleCache *make_handle_cache(gdraw_resourcetype type)
{
S32 num_handles = gdraw_limits[type].num_handles;
S32 num_bytes = gdraw_limits[type].num_bytes;
GDrawHandleCache *cache = (GDrawHandleCache *) IggyGDrawMalloc(sizeof(GDrawHandleCache) + (num_handles - 1) * sizeof(GDrawHandle));
if (cache) {
gdraw_HandleCacheInit(cache, num_handles, num_bytes);
cache->is_vertex = (type == GDRAW_D3D1X_(RESOURCE_vertexbuffer));
}
return cache;
}
static void free_gdraw()
{
if (!gdraw) return;
if (gdraw->texturecache) IggyGDrawFree(gdraw->texturecache);
if (gdraw->vbufcache) IggyGDrawFree(gdraw->vbufcache);
IggyGDrawFree(gdraw);
gdraw = NULL;
}
static bool alloc_dynbuffer(U32 size)
{
// specified input size is vertex buffer size. determine sensible size for the
// corresponding index buffer. iggy always uses 16-bit indices and has three
// primary types of geometry it sends:
//
// 1. filled polygons. these are triangulated simple polygons and thus have
// roughly as many triangles as they have vertices. they use either 8- or
// 16-byte vertex formats; this makes a worst case of 6 bytes of indices
// for every 8 bytes of vertex data.
// 2. strokes and edge antialiasing. they use a 16-byte vertex format and
// worst-case write a "double quadstrip" which has 4 triangles for every
// 3 vertices, which means 24 bytes of index data for every 48 bytes
// of vertex data.
// 3. textured quads. they use a 16-byte vertex format, have exactly 2
// triangles for every 4 vertices, and use either a static index buffer
// (quad_ib) or a single triangle strip, so for our purposes they need no
// space to store indices at all.
//
// 1) argues for allocating index buffers at 3/4 the size of the corresponding
// vertex buffer, while 2) and 3) need 1/2 the size of the vertex buffer or less.
// 2) and 3) are the most common types of vertex data, while 1) is used only for
// morphed shapes and in certain cases when the RESOURCE_vertexbuffer pool is full.
// we just play it safe anyway and make sure we size the IB large enough to cover
// the worst case for 1). this is conservative, but it probably doesn't matter much.
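//
// worked example (illustrative only; assumes the default 256*1024-byte
// RESOURCE_dynbuffer limit from gdraw_limits and the 16-byte textured-quad
// vertex format mentioned above):
//   vertex buffer: 262144 bytes
//   index buffer:  262144 * 3 / 4 = 196608 bytes
//   quad vertices: 262144 / 16 = 16384, before clamping to QUAD_IB_COUNT * 4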
U32 ibsize = (size * 3) / 4;
init_dyn_buffer(&gdraw->dyn_vb, size, D3D1X_(BIND_VERTEX_BUFFER));
init_dyn_buffer(&gdraw->dyn_ib, ibsize, D3D1X_(BIND_INDEX_BUFFER));
gdraw->max_quad_vert_count = RR_MIN(size / sizeof(gswf_vertex_xyst), QUAD_IB_COUNT * 4);
gdraw->max_quad_vert_count &= ~3; // must be multiple of four
return gdraw->dyn_vb.buffer != NULL && gdraw->dyn_ib.buffer != NULL;
}
int gdraw_D3D1X_(SetResourceLimits)(gdraw_resourcetype type, S32 num_handles, S32 num_bytes)
{
GDrawStats stats={0};
if (type == GDRAW_D3D1X_(RESOURCE_rendertarget)) // RT count is small and space is preallocated
num_handles = MAX_RENDER_STACK_DEPTH + 1;
assert(type >= GDRAW_D3D1X_(RESOURCE_rendertarget) && type < GDRAW_D3D1X_(RESOURCE__count));
assert(num_handles >= 0);
assert(num_bytes >= 0);
// nothing to do if the values are unchanged
if (gdraw_limits[type].num_handles == num_handles &&
gdraw_limits[type].num_bytes == num_bytes)
return 1;
gdraw_limits[type].num_handles = num_handles;
gdraw_limits[type].num_bytes = num_bytes;
// if no gdraw context created, there's nothing to worry about
if (!gdraw)
return 1;
// resize the appropriate pool
switch (type) {
case GDRAW_D3D1X_(RESOURCE_rendertarget):
flush_rendertargets(&stats);
gdraw_HandleCacheInit(&gdraw->rendertargets, num_handles, num_bytes);
return 1;
case GDRAW_D3D1X_(RESOURCE_texture):
if (gdraw->texturecache) {
gdraw_res_flush(gdraw->texturecache, &stats);
IggyGDrawFree(gdraw->texturecache);
}
gdraw->texturecache = make_handle_cache(GDRAW_D3D1X_(RESOURCE_texture));
return gdraw->texturecache != NULL;
case GDRAW_D3D1X_(RESOURCE_vertexbuffer):
if (gdraw->vbufcache) {
gdraw_res_flush(gdraw->vbufcache, &stats);
IggyGDrawFree(gdraw->vbufcache);
}
gdraw->vbufcache = make_handle_cache(GDRAW_D3D1X_(RESOURCE_vertexbuffer));
return gdraw->vbufcache != NULL;
case GDRAW_D3D1X_(RESOURCE_dynbuffer):
unbind_resources();
safe_release(gdraw->dyn_vb.buffer);
safe_release(gdraw->dyn_ib.buffer);
return alloc_dynbuffer(num_bytes);
default:
return 0;
}
}
static GDrawFunctions *create_context(ID3D1XDevice *dev, ID3D1XContext *ctx, S32 w, S32 h)
{
gdraw = (GDraw *) IggyGDrawMalloc(sizeof(*gdraw));
if (!gdraw) return NULL;
memset(gdraw, 0, sizeof(*gdraw));
gdraw->frametex_width = w;
gdraw->frametex_height = h;
gdraw->d3d_device = dev;
gdraw->d3d_context = ctx;
gdraw->texturecache = make_handle_cache(GDRAW_D3D1X_(RESOURCE_texture));
gdraw->vbufcache = make_handle_cache(GDRAW_D3D1X_(RESOURCE_vertexbuffer));
gdraw_HandleCacheInit(&gdraw->rendertargets, gdraw_limits[GDRAW_D3D1X_(RESOURCE_rendertarget)].num_handles, gdraw_limits[GDRAW_D3D1X_(RESOURCE_rendertarget)].num_bytes);
if (!gdraw->texturecache || !gdraw->vbufcache || !alloc_dynbuffer(gdraw_limits[GDRAW_D3D1X_(RESOURCE_dynbuffer)].num_bytes)) {
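// alloc_dynbuffer may have created one of the two buffers before failing
safe_release(gdraw->dyn_vb.buffer);
safe_release(gdraw->dyn_ib.buffer);
free_gdraw();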
return NULL;
}
create_all_shaders_and_state();
gdraw_funcs.SetViewSizeAndWorldScale = gdraw_SetViewSizeAndWorldScale;
gdraw_funcs.GetInfo = gdraw_GetInfo;
gdraw_funcs.DescribeTexture = gdraw_DescribeTexture;
gdraw_funcs.DescribeVertexBuffer = gdraw_DescribeVertexBuffer;
gdraw_funcs.RenderingBegin = gdraw_RenderingBegin;
gdraw_funcs.RenderingEnd = gdraw_RenderingEnd;
gdraw_funcs.RenderTileBegin = gdraw_RenderTileBegin;
gdraw_funcs.RenderTileEnd = gdraw_RenderTileEnd;
gdraw_funcs.TextureDrawBufferBegin = gdraw_TextureDrawBufferBegin;
gdraw_funcs.TextureDrawBufferEnd = gdraw_TextureDrawBufferEnd;
gdraw_funcs.DrawIndexedTriangles = gdraw_DrawIndexedTriangles;
gdraw_funcs.FilterQuad = gdraw_FilterQuad;
gdraw_funcs.SetAntialiasTexture = gdraw_SetAntialiasTexture;
gdraw_funcs.ClearStencilBits = gdraw_ClearStencilBits;
gdraw_funcs.ClearID = gdraw_ClearID;
gdraw_funcs.MakeTextureBegin = gdraw_MakeTextureBegin;
gdraw_funcs.MakeTextureMore = gdraw_MakeTextureMore;
gdraw_funcs.MakeTextureEnd = gdraw_MakeTextureEnd;
gdraw_funcs.UpdateTextureBegin = gdraw_UpdateTextureBegin;
gdraw_funcs.UpdateTextureRect = gdraw_UpdateTextureRect;
gdraw_funcs.UpdateTextureEnd = gdraw_UpdateTextureEnd;
gdraw_funcs.FreeTexture = gdraw_FreeTexture;
gdraw_funcs.TryToLockTexture = gdraw_TryToLockTexture;
gdraw_funcs.MakeTextureFromResource = (gdraw_make_texture_from_resource *) gdraw_D3D1X_(MakeTextureFromResource);
gdraw_funcs.FreeTextureFromResource = gdraw_D3D1X_(DestroyTextureFromResource);
gdraw_funcs.MakeVertexBufferBegin = gdraw_MakeVertexBufferBegin;
gdraw_funcs.MakeVertexBufferMore = gdraw_MakeVertexBufferMore;
gdraw_funcs.MakeVertexBufferEnd = gdraw_MakeVertexBufferEnd;
gdraw_funcs.TryToLockVertexBuffer = gdraw_TryLockVertexBuffer;
gdraw_funcs.FreeVertexBuffer = gdraw_FreeVertexBuffer;
gdraw_funcs.UnlockHandles = gdraw_UnlockHandles;
gdraw_funcs.SetTextureUniqueID = gdraw_SetTextureUniqueID;
gdraw_funcs.Set3DTransform = gdraw_Set3DTransform;
return &gdraw_funcs;
}
void gdraw_D3D1X_(DestroyContext)(void)
{
if (gdraw && gdraw->d3d_device) {
GDrawStats stats={0};
clear_renderstate();
stencil_state_cache_clear();
destroy_all_shaders_and_state();
safe_release(gdraw->aa_tex);
safe_release(gdraw->aa_tex_view);
safe_release(gdraw->dyn_vb.buffer);
safe_release(gdraw->dyn_ib.buffer);
flush_rendertargets(&stats);
if (gdraw->texturecache) gdraw_res_flush(gdraw->texturecache, &stats);
if (gdraw->vbufcache) gdraw_res_flush(gdraw->vbufcache, &stats);
gdraw->d3d_device = NULL;
}
free_gdraw();
}
void gdraw_D3D1X_(SetErrorHandler)(void (__cdecl *error_handler)(HRESULT hr))
{
if (gdraw)
gdraw->error_handler = error_handler;
}
void gdraw_D3D1X_(PreReset)(void)
{
if (!gdraw) return;
GDrawStats stats={0};
flush_rendertargets(&stats);
// we may end up resizing the frame buffer
gdraw->frametex_width = 0;
gdraw->frametex_height = 0;
}
void gdraw_D3D1X_(PostReset)(void)
{
// maybe re-create rendertargets right now?
}
void RADLINK gdraw_D3D1X_(BeginCustomDraw)(IggyCustomDrawCallbackRegion * region, F32 mat[4][4])
{
clear_renderstate();
gdraw_GetObjectSpaceMatrix(mat[0], region->o2w, gdraw->projection, 0, 0);
}
void RADLINK gdraw_D3D1X_(BeginCustomDraw_4J)(IggyCustomDrawCallbackRegion * region, F32 mat[16])
{
clear_renderstate();
gdraw_GetObjectSpaceMatrix(mat, region->o2w, gdraw->projection, 0, 0);
}
void RADLINK gdraw_D3D1X_(CalculateCustomDraw_4J)(IggyCustomDrawCallbackRegion * region, F32 mat[16])
{
gdraw_GetObjectSpaceMatrix(mat, region->o2w, gdraw->projection, 0, 0);
}
void RADLINK gdraw_D3D1X_(EndCustomDraw)(IggyCustomDrawCallbackRegion * /*region*/)
{
GDrawStats stats={};
set_common_renderstate();
set_viewport();
set_render_target(&stats);
}
void RADLINK gdraw_D3D1X_(GetResourceUsageStats)(gdraw_resourcetype type, S32 *handles_used, S32 *bytes_used)
{
GDrawHandleCache *cache;
switch (type) {
case GDRAW_D3D1X_(RESOURCE_rendertarget): cache = &gdraw->rendertargets; break;
case GDRAW_D3D1X_(RESOURCE_texture): cache = gdraw->texturecache; break;
case GDRAW_D3D1X_(RESOURCE_vertexbuffer): cache = gdraw->vbufcache; break;
case GDRAW_D3D1X_(RESOURCE_dynbuffer): *handles_used = 0; *bytes_used = gdraw->last_dyn_maxalloc; return;
default: cache = NULL; break;
}
*handles_used = *bytes_used = 0;
if (cache) {
S32 i;
U64 frame = gdraw->frame_counter;
for (i=0; i < cache->max_handles; ++i)
if (cache->handle[i].bytes && cache->handle[i].owner && cache->handle[i].fence.value == frame) {
*handles_used += 1;
*bytes_used += cache->handle[i].bytes;
}
}
}
static S32 num_pixels(S32 w, S32 h, S32 mipmaps)
{
S32 k, pixels=0;
for (k=0; k < mipmaps; ++k) {
pixels += w*h*2;
w = (w>>1); w += !w;
h = (h>>1); h += !h;
}
return pixels;
}
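// example for num_pixels (illustrative): num_pixels(4, 4, 3) walks the mip
// chain 4x4 -> 2x2 -> 1x1 and sums 4*4*2 + 2*2*2 + 1*1*2 = 32 + 8 + 2 = 42.
// note that each level is already counted at two units per pixel.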
GDrawTexture * RADLINK gdraw_D3D1X_(MakeTextureFromResource)(U8 *resource_file, S32 /*len*/, IggyFileTextureRaw *texture)
{
const char *failed_call="";
U8 *free_data = 0;
GDrawTexture *t=0;
S32 width, height, mipmaps, size, blk;
ID3D1X(Texture2D) *tex=0;
ID3D1X(ShaderResourceView) *view=0;
DXGI_FORMAT d3dfmt;
D3D1X_(SUBRESOURCE_DATA) mipdata[24] = { 0 };
S32 k;
HRESULT hr = S_OK;
width = texture->w;
height = texture->h;
mipmaps = texture->mipmaps;
blk = 1;
D3D1X_(TEXTURE2D_DESC) desc = { static_cast<U32>(width), static_cast<U32>(height), static_cast<U32>(mipmaps), 1U, DXGI_FORMAT_UNKNOWN, { 1, 0 },
D3D1X_(USAGE_IMMUTABLE), D3D1X_(BIND_SHADER_RESOURCE), 0U, 0U };
bool done = false;
switch (texture->format) {
case IFT_FORMAT_rgba_8888 : size= 4; d3dfmt = DXGI_FORMAT_R8G8B8A8_UNORM; break;
case IFT_FORMAT_DXT1 : size= 8; d3dfmt = DXGI_FORMAT_BC1_UNORM; blk = 4; break;
case IFT_FORMAT_DXT3 : size=16; d3dfmt = DXGI_FORMAT_BC2_UNORM; blk = 4; break;
case IFT_FORMAT_DXT5 : size=16; d3dfmt = DXGI_FORMAT_BC3_UNORM; blk = 4; break;
default: {
IggyGDrawSendWarning(NULL, "GDraw .iggytex raw texture format %d not supported by hardware", texture->format);
done = true;
}
}
if (!done) {
desc.Format = d3dfmt;
U8 *data = resource_file + texture->file_offset;
if (texture->format == IFT_FORMAT_i_8 || texture->format == IFT_FORMAT_i_4) {
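// convert from intensity to luma+alpha.
// note: the format switch above currently rejects IFT_FORMAT_i_8 and
// IFT_FORMAT_i_4 (they hit the default case), so this path is not reached.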
S32 i;
S32 total_size = 2 * num_pixels(width,height,mipmaps);
free_data = (U8 *) IggyGDrawMalloc(total_size);
if (!free_data) {
IggyGDrawSendWarning(NULL, "GDraw out of memory to store texture data to pass to D3D for %d x %d texture", width, height);
done = true;
} else {
U8 *cur = free_data;
for (k=0; k < mipmaps; ++k) {
S32 w = RR_MAX(width >> k, 1);
S32 h = RR_MAX(height >> k, 1);
for (i=0; i < w*h; ++i) {
cur[0] = cur[1] = *data++;
cur += 2;
}
}
data = free_data;
}
}
if (!done) {
for (k=0; k < mipmaps; ++k) {
S32 w = RR_MAX(width >> k, 1);
S32 h = RR_MAX(height >> k, 1);
S32 blkw = (w + blk-1) / blk;
S32 blkh = (h + blk-1) / blk;
mipdata[k].pSysMem = data;
mipdata[k].SysMemPitch = blkw * size;
data += blkw * blkh * size;
}
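// example (illustrative numbers, not from the source): an 8x8 DXT1 texture
// with 3 mip levels has blk=4 and size=8, giving
//   mip 0: 8x8 -> 2x2 blocks, pitch 16, 32 bytes
//   mip 1: 4x4 -> 1x1 blocks, pitch  8,  8 bytes
//   mip 2: 2x2 -> 1x1 blocks, pitch  8,  8 bytes
// so the loop above advances data by 48 bytes in total.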
failed_call = "CreateTexture2D";
hr = gdraw->d3d_device->CreateTexture2D(&desc, mipdata, &tex);
if (!FAILED(hr)) {
failed_call = "CreateShaderResourceView for texture creation";
hr = gdraw->d3d_device->CreateShaderResourceView(tex, NULL, &view);
if (!FAILED(hr))
t = gdraw_D3D1X_(WrappedTextureCreate)(view);
}
}
}
if (FAILED(hr)) {
report_d3d_error(hr, failed_call, "");
}
if (free_data)
IggyGDrawFree(free_data);
if (!t) {
if (view)
view->Release();
if (tex)
tex->Release();
} else {
((GDrawHandle *) t)->handle.tex.d3d = tex;
}
return t;
}
void RADLINK gdraw_D3D1X_(DestroyTextureFromResource)(GDrawTexture *tex)
{
GDrawHandle *h = (GDrawHandle *) tex;
safe_release(h->handle.tex.d3d_view);
safe_release(h->handle.tex.d3d);
gdraw_D3D1X_(WrappedTextureDestroy)(tex);
}