Compare commits

...

28 Commits

Author SHA1 Message Date
b90b5e5725 Merge branch 'motion-track-01' 2025-09-17 11:42:15 +02:00
ed6f809029 Refactor marker navigation in VideoEditor to utilize tracking points
This commit enhances the VideoEditor class by updating the marker navigation methods to focus on tracking points instead of cut markers. The new methods, jump_to_previous_marker and jump_to_next_marker, now utilize a sorted list of frames with tracking points, improving navigation efficiency. Additionally, the documentation has been updated to reflect these changes, providing clearer instructions for users on how to navigate using tracking markers.
2025-09-17 11:41:45 +02:00
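The wrap-around navigation this commit describes can be sketched as a standalone helper (hypothetical function name; assumes tracking points are kept as `{frame: [(x, y), ...]}`, as in the diff below):

```python
def next_tracking_frame(tracking_points, current, forward=True):
    """Return the next/previous frame that has tracking points, wrapping around."""
    frames = sorted(f for f, pts in tracking_points.items() if pts)
    if not frames:
        return None
    if forward:
        later = [f for f in frames if f > current]
        return later[0] if later else frames[0]      # wrap to first marker
    earlier = [f for f in frames if f < current]
    return earlier[-1] if earlier else frames[-1]    # wrap to last marker
```

Frames whose point list is empty are skipped, matching the `if v` filter in the committed code.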
8a7e2609c5 Add marker navigation functionality to VideoEditor
This commit introduces keyboard controls for jumping to previous and next markers (cut start or end) in the VideoEditor class. The new functionality enhances user navigation during video editing, allowing for more efficient management of cut points. This addition improves the overall editing experience by providing quick access to key markers in the timeline.
2025-09-17 11:33:40 +02:00
dd1bc12667 Add exact frame seeking functionality to VideoEditor
This commit introduces a new method, seek_video_exact_frame, in the VideoEditor class, allowing users to seek video by exactly one frame, independent of the seek multiplier. The functionality is integrated with keyboard controls for navigating to the previous and next frames, enhancing precision in video editing. This addition improves user control and responsiveness during frame navigation.
2025-09-17 11:23:48 +02:00
9c14249f88 Update VideoEditor configuration for seek multiplier settings
This commit modifies the seek multiplier settings in the VideoEditor class, increasing the SEEK_MULTIPLIER_INCREMENT from 2.0 to 4.0 and expanding the MAX_SEEK_MULTIPLIER from 100.0 to 1000.0. These changes enhance the flexibility and responsiveness of the seeking functionality during video editing.
2025-09-17 11:18:35 +02:00
47ec7fed04 Enhance VideoEditor with previous and next frame tracking point visualization
This commit adds functionality to the VideoEditor class to render tracking points from the previous and next frames with 50% alpha blending. Red circles indicate previous frame points, while green circles represent next frame points, improving the visual feedback during video editing. Additionally, the feedback message duration has been reduced for a more responsive user experience.
2025-09-17 10:42:54 +02:00
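The 50% ghosting described here is a per-pixel weighted average; a NumPy-only sketch of what the editor's `cv2.addWeighted` call computes (the helper name is invented, and a painted pixel stands in for the filled circle):

```python
import numpy as np

def blend_overlay(canvas, overlay, alpha=0.5):
    """Per-pixel weighted average, approximating cv2.addWeighted(overlay, a, canvas, 1-a, 0)."""
    out = overlay.astype(np.float32) * alpha + canvas.astype(np.float32) * (1.0 - alpha)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# Ghost a marker at 50% opacity: draw it solid on a copy, then average back in.
canvas = np.zeros((4, 4, 3), np.uint8)
overlay = canvas.copy()
overlay[1, 1] = (0, 0, 255)  # stand-in for the filled cv2.circle, BGR red
canvas = blend_overlay(canvas, overlay)
```

Copying the canvas before drawing, then blending, is what keeps the marker translucent instead of opaque.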
e80278a2dd Refactor frame processing in VideoEditor to enhance cropping and rotation handling
This commit updates the frame processing logic in the VideoEditor class to apply rotation before cropping, ensuring accurate handling of frames in rotated space. It introduces effective cropping that accommodates out-of-bounds scenarios by padding with black, improving the visual output during editing. Additionally, debug statements have been refined to provide clearer information on output dimensions and effective crop details, enhancing the overall user experience in video editing.
2025-09-17 09:00:14 +02:00
f1d4145e43 Implement unified display parameters in VideoEditor for improved cropping and zoom handling
This commit introduces a new method, _get_display_params, in the VideoEditor class to centralize the calculation of display parameters, including effective crop dimensions, offsets, and scaling factors. The coordinate mapping methods have been updated to utilize these unified parameters, enhancing the accuracy of point mapping between original and screen coordinates during zoom and cropping operations. This refactor improves code readability and maintainability while ensuring a consistent user experience in video editing.
2025-09-17 08:06:52 +02:00
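The core of the unified display parameters is a single downscale-only fit of the visible region into the window; a hypothetical standalone version (function and variable names invented here):

```python
def fit_to_window(visible_w, visible_h, win_w, win_h):
    """Downscale-only fit: scale the visible region into the window and center it."""
    scale = min(win_w / max(1, visible_w), win_h / max(1, visible_h))
    scale = min(scale, 1.0)           # never upscale, matching the editor's cap
    final_w = int(visible_w * scale)
    final_h = int(visible_h * scale)
    start_x = (win_w - final_w) // 2  # letterbox offsets for centering
    start_y = (win_h - final_h) // 2
    return scale, start_x, start_y, final_w, final_h
```

Computing these once and reusing them in every mapping method is what keeps forward and inverse coordinate mapping consistent.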
1d987a341a Refactor visibility calculations in VideoEditor to ensure proper cropping bounds
This commit updates the visibility width and height calculations in the VideoEditor class to use the minimum of the new dimensions and the window dimensions when cropping due to zoom. This change enhances the accuracy of the displayed area during zoom operations, ensuring that the canvas scaling is correctly applied and improving the overall user experience in video editing.
2025-09-17 01:56:16 +02:00
47ce52da37 Refactor coordinate mapping in VideoEditor to improve effective crop handling
This commit updates the coordinate mapping methods in the VideoEditor class to utilize the effective crop dimensions more accurately. It clarifies the calculations for mapping points between rotated frame coordinates and screen coordinates, ensuring consistent behavior during zoom and cropping operations. Additionally, comments have been refined for better understanding of the effective crop application, enhancing code readability and maintainability.
2025-09-17 01:54:04 +02:00
6c86271428 Refactor cropping logic in VideoEditor to utilize effective crop rectangle
This commit updates the cropping functionality in the VideoEditor class to use an effective crop rectangle derived from the current frame. It ensures that the crop is applied correctly within the bounds of the processed frame, enhancing the accuracy of cropping after rotation. Additionally, a visual outline of the effective crop is drawn on the canvas for debugging purposes, improving the user experience during video editing.
2025-09-17 01:51:56 +02:00
d0d2f66b11 Enhance effective crop rectangle calculation in VideoEditor for motion tracking
This commit updates the _get_effective_crop_rect_for_frame method to improve the calculation of the crop rectangle in rotated frame coordinates, incorporating tracking follow functionality. It ensures that the crop rectangle is accurately centered on the interpolated tracking position when motion tracking is enabled. Additionally, comments have been clarified to reflect the use of the effective crop, enhancing code readability and maintainability.
2025-09-17 01:48:49 +02:00
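Centering a crop on the interpolated tracking position is a clamp problem; a minimal sketch (hypothetical helper; unlike the committed code, which shrinks the crop at the border, this variant preserves the crop size by clamping the origin):

```python
def center_crop(cx, cy, w, h, frame_w, frame_h):
    """Center a w-by-h crop on (cx, cy), clamped so it stays inside the frame."""
    x = int(round(cx - w / 2))
    y = int(round(cy - h / 2))
    x = max(0, min(x, frame_w - w))   # keep the full crop width inside the frame
    y = max(0, min(y, frame_h - h))
    return x, y, w, h
```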
eeaeff6fe0 Enhance tracking point management in VideoEditor with rotated frame coordinates
This commit updates the VideoEditor class to store and manage tracking points in rotated frame coordinates, improving accuracy in overlay rendering and user interactions. It introduces a new method for mapping rotated coordinates to screen space and modifies existing methods to ensure consistent handling of coordinates throughout the editing process. These changes enhance the overall functionality and user experience when working with motion tracking in video editing.
2025-09-17 01:44:58 +02:00
d478b28e0d Refactor cropping and coordinate mapping in VideoEditor to support rotation
This commit modifies the cropping functionality to apply transformations in rotated frame coordinates, ensuring accurate cropping after rotation. It introduces a new method for mapping screen coordinates back to rotated frame coordinates, enhancing the overall cropping experience. Additionally, debug statements are added for better tracking of crop operations, improving the debugging process during development.
2025-09-17 01:42:08 +02:00
b440da3094 Refactor coordinate mapping in VideoEditor for improved zoom and rotation handling
This commit enhances the _map_original_to_screen and _map_screen_to_original methods by clarifying the calculations for zoom and rotation. It introduces new variables for better readability and ensures accurate mapping of coordinates, including adjustments for display offsets. The changes streamline the processing of frame dimensions and improve the overall functionality of the video editing experience.
2025-09-17 01:19:12 +02:00
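The rotation leg of this mapping is a pure coordinate permutation; a sketch of the forward/inverse pair for 90° steps, mirroring the formulas that appear in the diff below (function names invented here):

```python
def rotate_point(x, y, w, h, angle):
    """Map (x, y) in a w-by-h frame to rotated-frame coords for a 90-degree-step angle."""
    if angle == 90:
        return y, w - 1 - x
    if angle == 180:
        return w - 1 - x, h - 1 - y
    if angle == 270:
        return h - 1 - y, x
    return x, y

def unrotate_point(rx, ry, w, h, angle):
    """Inverse of rotate_point for the same original w-by-h frame."""
    if angle == 90:
        return w - 1 - ry, rx
    if angle == 180:
        return w - 1 - rx, h - 1 - ry
    if angle == 270:
        return ry, h - 1 - rx
    return rx, ry
```

Round-tripping every point through both functions is a cheap invariant check when refactoring mappings like these.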
fdf7d98850 Add motion tracking functionality to VideoEditor
This commit introduces motion tracking capabilities, allowing users to add and remove tracking points on video frames. The tracking state is managed with new attributes, and the crop functionality is enhanced to follow the tracked motion. Additionally, the user interface is updated to reflect the tracking status, and keyboard shortcuts are added for toggling tracking and clearing points. This feature improves the editing experience by enabling dynamic cropping based on motion analysis.
2025-09-17 01:14:26 +02:00
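Following tracked motion between sparsely placed points reduces to linear interpolation over frame numbers, clamped at the ends; a hypothetical standalone sketch (single point per keyframe, whereas the editor averages multiple points per frame):

```python
def interpolate_position(keyframes, frame):
    """keyframes: {frame: (x, y)}. Linear interpolation, clamped at the ends."""
    frames = sorted(keyframes)
    if not frames:
        return None
    if frame <= frames[0]:
        return keyframes[frames[0]]
    if frame >= frames[-1]:
        return keyframes[frames[-1]]
    for f1, f2 in zip(frames, frames[1:]):
        if f1 <= frame <= f2:
            x1, y1 = keyframes[f1]
            x2, y2 = keyframes[f2]
            t = (frame - f1) / (f2 - f1)
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
```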
3ee5c1bddc Update spec 2025-09-17 00:18:40 +02:00
1d1d113a92 Refactor tracking point management in VideoEditor and MotionTracker to ensure accurate display coordinates
This commit modifies the handling of tracking points in the VideoEditor and MotionTracker classes by removing the storage of display coordinates. Instead, display coordinates are now calculated dynamically during rendering, ensuring accuracy regardless of crop or rotation states. The changes enhance the consistency of point transformations and improve logging for better debugging and verification of coordinate accuracy during mouse interactions.
2025-09-16 22:07:16 +02:00
e162e4fe92 Refactor tracking point management in VideoEditor and MotionTracker to ensure accurate display coordinates
This commit modifies the handling of tracking points in the VideoEditor and MotionTracker classes by removing the storage of display coordinates. Instead, display coordinates are now calculated dynamically during rendering, ensuring accuracy regardless of crop or rotation states. The changes enhance the consistency of point transformations and improve logging for better debugging and verification of coordinate accuracy during mouse interactions.
2025-09-16 21:34:17 +02:00
cd86cfc9f2 Enhance tracking point management in VideoEditor and MotionTracker with dual coordinate storage
This commit introduces a new TrackingPoint class to encapsulate both original and display coordinates for tracking points, improving the accuracy and consistency of point transformations. The VideoEditor class has been updated to utilize this new structure, allowing for better handling of tracking points during video editing. Additionally, logging has been enhanced to provide clearer insights into the addition and processing of tracking points, while redundant verification steps have been removed for efficiency. This change streamlines the tracking process and improves the overall user experience.
2025-09-16 21:33:28 +02:00
33a553c092 Refine VideoEditor point transformation methods with enhanced consistency and logging
This commit improves the point transformation and untransformation methods in the VideoEditor class by ensuring they match the applied transformations precisely. It adds checks for null points and current display frames, enhancing robustness. Additionally, detailed logging has been introduced to track coordinate adjustments during cropping and transformations, aiding in debugging and ensuring consistent behavior across the coordinate mapping process.
2025-09-16 20:58:54 +02:00
2979dca40a Refine VideoEditor point transformation and crop handling with enhanced logging
This commit improves the point transformation methods in the VideoEditor class by incorporating interpolated positions for cropping and refining the handling of coordinates during transformations. It adds detailed logging to track the adjustments made to crop centers and the verification of transformed points, ensuring better debugging and visibility into the state of the video editing process. Additionally, it enhances bounds checking for transformed points, maintaining consistency with the original frame dimensions.
2025-09-16 20:38:16 +02:00
cb097c55f1 Enhance VideoEditor with improved point transformation, bounds checking, and debugging features
This commit refines the point transformation methods in the VideoEditor class, ensuring coordinates are validated against frame dimensions and crop areas. It adds detailed logging for point transformations and mouse interactions, improving visibility into the state of tracking points. Additionally, it introduces debug features for visualizing crop rectangles and point indices, enhancing the debugging experience during video editing.
2025-09-16 20:32:33 +02:00
70364d0458 Update .gitignore and enhance VideoEditor with improved crop handling and logging
This commit adds a new entry to the .gitignore file to exclude log files. In the VideoEditor class, it refines the crop position adjustment logic to calculate the center of the crop rectangle before applying offsets, ensuring more accurate positioning. Additionally, it enhances logging throughout the point transformation and tracking processes, providing better insights into the state of tracking points and their visibility relative to the crop area.
2025-09-16 20:24:20 +02:00
c88c2cc354 Enhance VideoEditor and MotionTracker with improved logging and crop handling
This commit adds detailed logging to the VideoEditor and MotionTracker classes, providing insights into the transformations and adjustments made during point processing and cropping. It refines the crop position adjustment logic to ensure offsets are only applied when necessary and enhances the visibility of tracking points based on their position relative to the crop area. Additionally, it improves the handling of motion tracking toggling, ensuring a default crop rect is created when needed, thus enhancing the overall user experience during video editing.
2025-09-16 20:17:54 +02:00
9085a82bdd Enhance VideoEditor with improved point transformation and tracking logic
This commit refines the point transformation process in the VideoEditor class by ensuring coordinates are converted to floats and validating their positions relative to the crop area. It also updates the right-click event handling to accurately convert display coordinates to original frame coordinates, allowing for better interaction with tracking points. Additionally, the MotionTracker class is modified to set a default zoom center based on the crop rect if none is provided, improving the tracking functionality.
2025-09-16 20:04:34 +02:00
85891a5f99 Add motion tracking functionality to VideoEditor
This commit introduces a new MotionTracker class for handling motion tracking during video editing. The VideoEditor class has been updated to integrate motion tracking features, including adding and removing tracking points, interpolating positions, and applying tracking offsets during cropping. The user can toggle motion tracking and clear tracking points via keyboard shortcuts. Additionally, the state management has been enhanced to save and load motion tracking data, improving the overall editing experience.
2025-09-16 19:56:58 +02:00
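Saving tracking data to the JSON state file requires the key round-trip visible in the diff below, since JSON object keys are always strings; a minimal sketch (helper names invented):

```python
import json

def dump_tracking(tracking_points):
    """Serialize {int_frame: points}; JSON forces the keys to strings."""
    return json.dumps({str(k): v for k, v in tracking_points.items()})

def load_tracking(payload):
    """Restore integer frame keys on load."""
    return {int(k): v for k, v in json.loads(payload).items()}
```

Note that tuples become lists through JSON, so point data should be treated as lists after a reload.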
66b23834fd Add spec file 2025-09-16 19:46:16 +02:00
4 changed files with 844 additions and 162 deletions

.gitignore vendored

@@ -1,3 +1,4 @@
__pycache__
croppa/build/lib
croppa/croppa.egg-info
*.log


@@ -7,10 +7,10 @@ from pathlib import Path
from typing import List
import time
import re
-import json
import threading
-import queue
+import json
import subprocess
+import queue
import ctypes
class Cv2BufferedCap:
@@ -467,9 +467,9 @@ class VideoEditor:
MAX_PLAYBACK_SPEED = 10.0
# Seek multiplier configuration
-SEEK_MULTIPLIER_INCREMENT = 2.0
+SEEK_MULTIPLIER_INCREMENT = 4.0
MIN_SEEK_MULTIPLIER = 1.0
-MAX_SEEK_MULTIPLIER = 100.0
+MAX_SEEK_MULTIPLIER = 1000.0
# Auto-repeat seeking configuration
AUTO_REPEAT_DISPLAY_RATE = 1.0
@@ -581,7 +581,7 @@ class VideoEditor:
# Feedback message state
self.feedback_message = ""
self.feedback_message_time = None
-self.feedback_message_duration = 0.5 # seconds to show message
+self.feedback_message_duration = 0.2 # seconds to show message
# Crop adjustment settings
self.crop_size_step = self.CROP_SIZE_STEP
@@ -601,6 +601,10 @@ class VideoEditor:
self.cached_frame_number = None
self.cached_transform_hash = None
# Motion tracking state
self.tracking_points = {} # {frame_number: [(x, y), ...]} in original frame coords
self.tracking_enabled = False
# Project view mode
self.project_view_mode = False
self.project_view = None
@@ -643,7 +647,9 @@ class VideoEditor:
'display_offset': self.display_offset,
'playback_speed': getattr(self, 'playback_speed', 1.0),
'seek_multiplier': getattr(self, 'seek_multiplier', 1.0),
-'is_playing': getattr(self, 'is_playing', False)
+'is_playing': getattr(self, 'is_playing', False),
+'tracking_enabled': self.tracking_enabled,
+'tracking_points': {str(k): v for k, v in self.tracking_points.items()}
}
with open(state_file, 'w') as f:
@@ -719,6 +725,12 @@ class VideoEditor:
if 'is_playing' in state:
self.is_playing = state['is_playing']
print(f"Loaded is_playing: {self.is_playing}")
if 'tracking_enabled' in state:
self.tracking_enabled = state['tracking_enabled']
print(f"Loaded tracking_enabled: {self.tracking_enabled}")
if 'tracking_points' in state and isinstance(state['tracking_points'], dict):
self.tracking_points = {int(k): v for k, v in state['tracking_points'].items()}
print(f"Loaded tracking_points: {sum(len(v) for v in self.tracking_points.values())} points")
# Validate cut markers against current video length
if self.cut_start_frame is not None and self.cut_start_frame >= self.total_frames:
@@ -994,6 +1006,14 @@ class VideoEditor:
frames = direction * int(base_frames * self.seek_multiplier)
self.seek_video(frames)
def seek_video_exact_frame(self, direction: int):
"""Seek video by exactly 1 frame, unaffected by seek multiplier"""
if self.is_image_mode:
return
frames = direction # Always exactly 1 frame
self.seek_video(frames)
def start_auto_repeat_seek(self, direction: int, shift_pressed: bool, ctrl_pressed: bool):
"""Start auto-repeat seeking"""
if self.is_image_mode:
@@ -1037,6 +1057,51 @@ class VideoEditor:
self.current_frame = max(0, min(frame_number, self.total_frames - 1))
self.load_current_frame()
def _get_sorted_markers(self):
"""Return sorted unique marker list [cut_start_frame, cut_end_frame] as ints within bounds."""
markers = []
for m in (self.cut_start_frame, self.cut_end_frame):
if isinstance(m, int):
markers.append(m)
if not markers:
return []
# Clamp and dedupe
clamped = set(max(0, min(m, self.total_frames - 1)) for m in markers)
return sorted(clamped)
def jump_to_previous_marker(self):
"""Jump to the previous tracking marker (frame with tracking points)."""
if self.is_image_mode:
return
self.stop_auto_repeat_seek()
tracking_frames = sorted(k for k, v in self.tracking_points.items() if v)
if not tracking_frames:
print("DEBUG: No tracking markers; prev jump ignored")
return
current = self.current_frame
candidates = [f for f in tracking_frames if f < current]
target = candidates[-1] if candidates else tracking_frames[-1]
print(f"DEBUG: Jump prev tracking from {current} -> {target}; tracking_frames={tracking_frames}")
self.seek_to_frame(target)
def jump_to_next_marker(self):
"""Jump to the next tracking marker (frame with tracking points)."""
if self.is_image_mode:
return
self.stop_auto_repeat_seek()
tracking_frames = sorted(k for k, v in self.tracking_points.items() if v)
if not tracking_frames:
print("DEBUG: No tracking markers; next jump ignored")
return
current = self.current_frame
for f in tracking_frames:
if f > current:
print(f"DEBUG: Jump next tracking from {current} -> {f}; tracking_frames={tracking_frames}")
self.seek_to_frame(f)
return
print(f"DEBUG: Jump next tracking wrap from {current} -> {tracking_frames[0]}; tracking_frames={tracking_frames}")
self.seek_to_frame(tracking_frames[0])
def advance_frame(self) -> bool:
"""Advance to next frame - handles playback speed and marker looping"""
if not self.is_playing:
@@ -1087,22 +1152,19 @@ class VideoEditor:
# Apply brightness/contrast first (to original frame for best quality)
processed_frame = self.apply_brightness_contrast(processed_frame)
-# Apply crop
-if self.crop_rect:
-x, y, w, h = self.crop_rect
-x, y, w, h = int(x), int(y), int(w), int(h)
-# Ensure crop is within frame bounds
-x = max(0, min(x, processed_frame.shape[1] - 1))
-y = max(0, min(y, processed_frame.shape[0] - 1))
-w = min(w, processed_frame.shape[1] - x)
-h = min(h, processed_frame.shape[0] - y)
-if w > 0 and h > 0:
-processed_frame = processed_frame[y : y + h, x : x + w]
-# Apply rotation
+# Apply rotation first so crop_rect is in ROTATED frame coordinates
if self.rotation_angle != 0:
processed_frame = self.apply_rotation(processed_frame)
+# Apply crop (interpreted in rotated frame coordinates) using EFFECTIVE rect
+eff_x, eff_y, eff_w, eff_h = self._get_effective_crop_rect_for_frame(getattr(self, 'current_frame', 0))
+if eff_w > 0 and eff_h > 0:
+eff_x = max(0, min(eff_x, processed_frame.shape[1] - 1))
+eff_y = max(0, min(eff_y, processed_frame.shape[0] - 1))
+eff_w = min(eff_w, processed_frame.shape[1] - eff_x)
+eff_h = min(eff_h, processed_frame.shape[0] - eff_y)
+processed_frame = processed_frame[eff_y : eff_y + eff_h, eff_x : eff_x + eff_w]
# Apply zoom
if self.zoom_factor != 1.0:
height, width = processed_frame.shape[:2]
@@ -1129,6 +1191,233 @@ class VideoEditor:
return processed_frame
# --- Motion tracking helpers ---
def _get_effective_crop_rect_for_frame(self, frame_number):
"""Return EFFECTIVE crop_rect in ROTATED frame coords for this frame (applies tracking follow)."""
# Rotated base dims
if self.rotation_angle in (90, 270):
rot_w, rot_h = self.frame_height, self.frame_width
else:
rot_w, rot_h = self.frame_width, self.frame_height
# Default full-frame
if not self.crop_rect:
return (0, 0, rot_w, rot_h)
x, y, w, h = map(int, self.crop_rect)
# Tracking follow: center crop on interpolated rotated position
if self.tracking_enabled:
pos = self._get_interpolated_tracking_position(frame_number)
if pos:
cx, cy = pos
x = int(round(cx - w / 2))
y = int(round(cy - h / 2))
# Clamp in rotated space
x = max(0, min(x, rot_w - 1))
y = max(0, min(y, rot_h - 1))
w = min(w, rot_w - x)
h = min(h, rot_h - y)
return (x, y, w, h)
def _get_interpolated_tracking_position(self, frame_number):
"""Linear interpolation in ROTATED frame coords. Returns (rx, ry) or None."""
if not self.tracking_points:
return None
frames = sorted(self.tracking_points.keys())
if not frames:
return None
if frame_number in self.tracking_points and self.tracking_points[frame_number]:
pts = self.tracking_points[frame_number]
return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))
if frame_number < frames[0]:
pts = self.tracking_points[frames[0]]
return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts)) if pts else None
if frame_number > frames[-1]:
pts = self.tracking_points[frames[-1]]
return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts)) if pts else None
for i in range(len(frames) - 1):
f1, f2 = frames[i], frames[i + 1]
if f1 <= frame_number <= f2:
pts1 = self.tracking_points.get(f1) or []
pts2 = self.tracking_points.get(f2) or []
if not pts1 or not pts2:
continue
x1 = sum(p[0] for p in pts1) / len(pts1)
y1 = sum(p[1] for p in pts1) / len(pts1)
x2 = sum(p[0] for p in pts2) / len(pts2)
y2 = sum(p[1] for p in pts2) / len(pts2)
t = (frame_number - f1) / (f2 - f1) if f2 != f1 else 0.0
return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
return None
def _get_display_params(self):
"""Unified display transform parameters for current frame in rotated space."""
eff_x, eff_y, eff_w, eff_h = self._get_effective_crop_rect_for_frame(getattr(self, 'current_frame', 0))
new_w = int(eff_w * self.zoom_factor)
new_h = int(eff_h * self.zoom_factor)
cropped_due_to_zoom = (self.zoom_factor != 1.0) and (new_w > self.window_width or new_h > self.window_height)
if cropped_due_to_zoom:
offx_max = max(0, new_w - self.window_width)
offy_max = max(0, new_h - self.window_height)
offx = max(0, min(int(self.display_offset[0]), offx_max))
offy = max(0, min(int(self.display_offset[1]), offy_max))
visible_w = min(new_w, self.window_width)
visible_h = min(new_h, self.window_height)
else:
offx = 0
offy = 0
visible_w = new_w
visible_h = new_h
available_height = self.window_height - (0 if self.is_image_mode else self.TIMELINE_HEIGHT)
scale_raw = min(self.window_width / max(1, visible_w), available_height / max(1, visible_h))
scale = scale_raw if scale_raw < 1.0 else 1.0
final_w = int(visible_w * scale)
final_h = int(visible_h * scale)
start_x = (self.window_width - final_w) // 2
start_y = (available_height - final_h) // 2
return {
'eff_x': eff_x, 'eff_y': eff_y, 'eff_w': eff_w, 'eff_h': eff_h,
'offx': offx, 'offy': offy,
'scale': scale,
'start_x': start_x, 'start_y': start_y,
'visible_w': visible_w, 'visible_h': visible_h,
'available_h': available_height
}
def _map_original_to_screen(self, ox, oy):
"""Map a point in original frame coords to canvas screen coords."""
frame_number = getattr(self, 'current_frame', 0)
# Since crop is applied after rotation, mapping to rotated space uses only rotation
angle = self.rotation_angle
if angle == 90:
rx, ry = oy, self.frame_width - 1 - ox
elif angle == 180:
rx, ry = self.frame_width - 1 - ox, self.frame_height - 1 - oy
elif angle == 270:
rx, ry = self.frame_height - 1 - oy, ox
else:
rx, ry = ox, oy
# Now account for crop/zoom/offset using unified params
params = self._get_display_params()
rx -= params['eff_x']
ry -= params['eff_y']
zx = rx * self.zoom_factor
zy = ry * self.zoom_factor
inframe_x = zx - params['offx']
inframe_y = zy - params['offy']
sx = int(round(params['start_x'] + inframe_x * params['scale']))
sy = int(round(params['start_y'] + inframe_y * params['scale']))
return sx, sy
def _map_screen_to_original(self, sx, sy):
"""Map a point on canvas screen coords back to original frame coords."""
frame_number = getattr(self, 'current_frame', 0)
angle = self.rotation_angle
ch, cw = self.frame_height, self.frame_width
# Zoomed dimensions
if angle in (90, 270):
rotated_w, rotated_h = ch, cw
else:
rotated_w, rotated_h = cw, ch
new_w = int(rotated_w * self.zoom_factor)
new_h = int(rotated_h * self.zoom_factor)
# Whether apply_crop_zoom_and_rotation cropped due to zoom
cropped_due_to_zoom = (self.zoom_factor != 1.0) and (new_w > self.window_width or new_h > self.window_height)
# Visible dims before canvas scaling
visible_w = new_w if not cropped_due_to_zoom else min(new_w, self.window_width)
visible_h = new_h if not cropped_due_to_zoom else min(new_h, self.window_height)
# Canvas scale and placement
available_height = self.window_height - (0 if self.is_image_mode else self.TIMELINE_HEIGHT)
scale_raw = min(self.window_width / max(1, visible_w), available_height / max(1, visible_h))
scale_canvas = scale_raw if scale_raw < 1.0 else 1.0
final_w = int(visible_w * scale_canvas)
final_h = int(visible_h * scale_canvas)
start_x_canvas = (self.window_width - final_w) // 2
start_y_canvas = (available_height - final_h) // 2
# Back to processed (zoomed+cropped) space
zx = (sx - start_x_canvas) / max(1e-6, scale_canvas)
zy = (sy - start_y_canvas) / max(1e-6, scale_canvas)
# Add display offset in zoomed space (only if cropped_due_to_zoom)
if cropped_due_to_zoom:
offx_max = max(0, new_w - self.window_width)
offy_max = max(0, new_h - self.window_height)
offx = max(0, min(int(self.display_offset[0]), offx_max))
offy = max(0, min(int(self.display_offset[1]), offy_max))
else:
offx = 0
offy = 0
zx += offx
zy += offy
# Reverse zoom
rx = zx / max(1e-6, self.zoom_factor)
ry = zy / max(1e-6, self.zoom_factor)
# Reverse crop in rotated space to get rotated coordinates
cx, cy, cw, ch = self._get_effective_crop_rect_for_frame(frame_number)
rx = rx + cx
ry = ry + cy
# Reverse rotation to original frame coords
if angle == 90:
ox, oy = self.frame_width - 1 - ry, rx
elif angle == 180:
ox, oy = self.frame_width - 1 - rx, self.frame_height - 1 - ry
elif angle == 270:
ox, oy = ry, self.frame_height - 1 - rx
else:
ox, oy = rx, ry
ox = max(0, min(int(round(ox)), self.frame_width - 1))
oy = max(0, min(int(round(oy)), self.frame_height - 1))
return ox, oy
def _map_rotated_to_screen(self, rx, ry):
"""Map a point in ROTATED frame coords to canvas screen coords (post-crop)."""
# Subtract crop offset in rotated space (EFFECTIVE crop at current frame)
cx, cy, cw, ch = self._get_effective_crop_rect_for_frame(getattr(self, 'current_frame', 0))
rx2 = rx - cx
ry2 = ry - cy
# Zoomed dimensions of cropped-rotated frame
new_w = int(cw * self.zoom_factor)
new_h = int(ch * self.zoom_factor)
cropped_due_to_zoom = (self.zoom_factor != 1.0) and (new_w > self.window_width or new_h > self.window_height)
if cropped_due_to_zoom:
offx_max = max(0, new_w - self.window_width)
offy_max = max(0, new_h - self.window_height)
offx = max(0, min(int(self.display_offset[0]), offx_max))
offy = max(0, min(int(self.display_offset[1]), offy_max))
else:
offx = 0
offy = 0
zx = rx2 * self.zoom_factor - offx
zy = ry2 * self.zoom_factor - offy
visible_w = new_w if not cropped_due_to_zoom else min(new_w, self.window_width)
visible_h = new_h if not cropped_due_to_zoom else min(new_h, self.window_height)
available_height = self.window_height - (0 if self.is_image_mode else self.TIMELINE_HEIGHT)
scale_raw = min(self.window_width / max(1, visible_w), available_height / max(1, visible_h))
scale_canvas = scale_raw if scale_raw < 1.0 else 1.0
final_w = int(visible_w * scale_canvas)
final_h = int(visible_h * scale_canvas)
start_x_canvas = (self.window_width - final_w) // 2
start_y_canvas = (available_height - final_h) // 2
sx = int(round(start_x_canvas + zx * scale_canvas))
sy = int(round(start_y_canvas + zy * scale_canvas))
return sx, sy
def _map_screen_to_rotated(self, sx, sy):
"""Map a point on canvas screen coords back to ROTATED frame coords (pre-crop)."""
frame_number = getattr(self, 'current_frame', 0)
angle = self.rotation_angle
# Use unified display params
params = self._get_display_params()
# Back to processed (zoomed+cropped) space
zx = (sx - params['start_x']) / max(1e-6, params['scale'])
zy = (sy - params['start_y']) / max(1e-6, params['scale'])
zx += params['offx']
zy += params['offy']
# Reverse zoom
rx = zx / max(1e-6, self.zoom_factor)
ry = zy / max(1e-6, self.zoom_factor)
# Unapply current EFFECTIVE crop to get PRE-crop rotated coords
rx = rx + params['eff_x']
ry = ry + params['eff_y']
return int(round(rx)), int(round(ry))
def clear_transformation_cache(self):
"""Clear the cached transformation to force recalculation"""
self.cached_transformed_frame = None
@@ -1665,10 +1954,13 @@ class VideoEditor:
seek_multiplier_text = (
f" | Seek: {self.seek_multiplier:.1f}x" if self.seek_multiplier != 1.0 else ""
)
+motion_text = (
+f" | Motion: {self.tracking_enabled}" if self.tracking_enabled else ""
+)
if self.is_image_mode:
-info_text = f"Image | Zoom: {self.zoom_factor:.1f}x{rotation_text}{brightness_text}{contrast_text}"
+info_text = f"Image | Zoom: {self.zoom_factor:.1f}x{rotation_text}{brightness_text}{contrast_text}{motion_text}"
else:
-info_text = f"Frame: {self.current_frame}/{self.total_frames} | Speed: {self.playback_speed:.1f}x | Zoom: {self.zoom_factor:.1f}x{seek_multiplier_text}{rotation_text}{brightness_text}{contrast_text} | {'Playing' if self.is_playing else 'Paused'}"
+info_text = f"Frame: {self.current_frame}/{self.total_frames} | Speed: {self.playback_speed:.1f}x | Zoom: {self.zoom_factor:.1f}x{seek_multiplier_text}{rotation_text}{brightness_text}{contrast_text}{motion_text} | {'Playing' if self.is_playing else 'Paused'}"
cv2.putText(
canvas,
info_text,
@@ -1754,6 +2046,49 @@ class VideoEditor:
1,
)
# Draw tracking overlays (points and interpolated cross), points stored in ROTATED space
pts = self.tracking_points.get(self.current_frame, []) if not self.is_image_mode else []
for (rx, ry) in pts:
sx, sy = self._map_rotated_to_screen(rx, ry)
cv2.circle(canvas, (sx, sy), 6, (255, 0, 0), -1)
cv2.circle(canvas, (sx, sy), 6, (255, 255, 255), 1)
# Draw previous and next frame tracking points with 50% alpha
if not self.is_image_mode and self.tracking_points:
# Previous frame tracking points (red)
prev_frame = self.current_frame - 1
if prev_frame in self.tracking_points:
prev_pts = self.tracking_points[prev_frame]
for (rx, ry) in prev_pts:
sx, sy = self._map_rotated_to_screen(rx, ry)
# Create overlay for alpha blending
overlay = canvas.copy()
cv2.circle(overlay, (sx, sy), 4, (0, 0, 255), -1) # Red circle
cv2.addWeighted(overlay, 0.5, canvas, 0.5, 0, canvas)
# Next frame tracking points (green)
next_frame = self.current_frame + 1
if next_frame in self.tracking_points:
next_pts = self.tracking_points[next_frame]
for (rx, ry) in next_pts:
sx, sy = self._map_rotated_to_screen(rx, ry)
# Create overlay for alpha blending
overlay = canvas.copy()
cv2.circle(overlay, (sx, sy), 4, (0, 255, 0), -1) # Green circle
cv2.addWeighted(overlay, 0.5, canvas, 0.5, 0, canvas)
if self.tracking_enabled and not self.is_image_mode:
interp = self._get_interpolated_tracking_position(self.current_frame)
if interp:
sx, sy = self._map_rotated_to_screen(interp[0], interp[1])
cv2.line(canvas, (sx - 10, sy), (sx + 10, sy), (255, 0, 0), 2)
cv2.line(canvas, (sx, sy - 10), (sx, sy + 10), (255, 0, 0), 2)
# Draw a faint outline of the effective crop to confirm follow
eff_x, eff_y, eff_w, eff_h = self._get_effective_crop_rect_for_frame(self.current_frame)
# Map rotated crop corners to screen for debug outline
tlx, tly = self._map_rotated_to_screen(eff_x, eff_y)
brx, bry = self._map_rotated_to_screen(eff_x + eff_w, eff_y + eff_h)
cv2.rectangle(canvas, (tlx, tly), (brx, bry), (255, 0, 0), 1)
# Draw timeline
self.draw_timeline(canvas)
@@ -1789,6 +2124,7 @@ class VideoEditor:
if flags & cv2.EVENT_FLAG_SHIFTKEY:
if event == cv2.EVENT_LBUTTONDOWN:
print(f"DEBUG: Crop start at screen=({x},{y}) frame={getattr(self, 'current_frame', -1)}")
self.crop_selecting = True
self.crop_start_point = (x, y)
self.crop_preview_rect = None
@@ -1802,6 +2138,7 @@ class VideoEditor:
self.crop_preview_rect = (crop_x, crop_y, width, height)
elif event == cv2.EVENT_LBUTTONUP and self.crop_selecting:
if self.crop_start_point and self.crop_preview_rect:
print(f"DEBUG: Crop end screen_rect={self.crop_preview_rect}")
# Convert screen coordinates to video coordinates
self.set_crop_from_screen_coords(self.crop_preview_rect)
self.crop_selecting = False
@@ -1812,6 +2149,32 @@ class VideoEditor:
if flags & cv2.EVENT_FLAG_CTRLKEY and event == cv2.EVENT_LBUTTONDOWN:
self.zoom_center = (x, y)
# Handle right-click for tracking points (no modifiers)
if event == cv2.EVENT_RBUTTONDOWN and not (flags & (cv2.EVENT_FLAG_CTRLKEY | cv2.EVENT_FLAG_SHIFTKEY)):
if not self.is_image_mode:
# Store tracking points in ROTATED frame coordinates (pre-crop)
rx, ry = self._map_screen_to_rotated(x, y)
threshold = 50
removed = False
if self.current_frame in self.tracking_points:
pts_screen = []
for idx, (px, py) in enumerate(self.tracking_points[self.current_frame]):
sxp, syp = self._map_rotated_to_screen(px, py)
pts_screen.append((idx, sxp, syp))
for idx, sxp, syp in pts_screen:
if (sxp - x) ** 2 + (syp - y) ** 2 <= threshold ** 2:
del self.tracking_points[self.current_frame][idx]
if not self.tracking_points[self.current_frame]:
del self.tracking_points[self.current_frame]
# self.show_feedback_message("Tracking point removed")
removed = True
break
if not removed:
self.tracking_points.setdefault(self.current_frame, []).append((int(rx), int(ry)))
# self.show_feedback_message("Tracking point added")
self.clear_transformation_cache()
self.save_state()
# Handle scroll wheel for zoom (Ctrl + scroll)
if flags & cv2.EVENT_FLAG_CTRLKEY:
if event == cv2.EVENT_MOUSEWHEEL:
@@ -1832,119 +2195,67 @@ class VideoEditor:
if self.current_display_frame is None:
return
# Get the original frame dimensions
original_height, original_width = self.current_display_frame.shape[:2]
available_height = self.window_height - (0 if self.is_image_mode else self.TIMELINE_HEIGHT)
# Debug context for crop mapping
print("DEBUG: set_crop_from_screen_coords")
print(f"DEBUG: input screen_rect=({x},{y},{w},{h})")
print(f"DEBUG: state rotation={self.rotation_angle} zoom={self.zoom_factor} window=({self.window_width},{self.window_height})")
print(f"DEBUG: display_offset={self.display_offset} is_image_mode={self.is_image_mode}")
print(f"DEBUG: current crop_rect={self.crop_rect}")
eff = self._get_effective_crop_rect_for_frame(getattr(self, 'current_frame', 0)) if self.crop_rect else None
print(f"DEBUG: effective_crop_for_frame={eff}")
# Calculate how the original frame is displayed (after crop/zoom/rotation)
display_frame = self.apply_crop_zoom_and_rotation(
self.current_display_frame.copy()
)
if display_frame is None:
return
# Map both corners from screen to ROTATED space, then derive crop in rotated coords
x2 = x + w
y2 = y + h
rx1, ry1 = self._map_screen_to_rotated(x, y)
rx2, ry2 = self._map_screen_to_rotated(x2, y2)
print(f"DEBUG: mapped ROTATED corners -> ({rx1},{ry1}) and ({rx2},{ry2})")
left_r = min(rx1, rx2)
top_r = min(ry1, ry2)
right_r = max(rx1, rx2)
bottom_r = max(ry1, ry2)
crop_x = left_r
crop_y = top_r
crop_w = max(10, right_r - left_r)
crop_h = max(10, bottom_r - top_r)
display_height, display_width = display_frame.shape[:2]
# Calculate scale for the display frame
scale = min(
self.window_width / display_width, available_height / display_height
)
if scale < 1.0:
final_display_width = int(display_width * scale)
final_display_height = int(display_height * scale)
# Clamp to rotated frame bounds
if self.rotation_angle in (90, 270):
rot_w, rot_h = self.frame_height, self.frame_width
else:
final_display_width = display_width
final_display_height = display_height
scale = 1.0
rot_w, rot_h = self.frame_width, self.frame_height
crop_x = max(0, min(crop_x, rot_w - 1))
crop_y = max(0, min(crop_y, rot_h - 1))
crop_w = min(crop_w, rot_w - crop_x)
crop_h = min(crop_h, rot_h - crop_y)
start_x = (self.window_width - final_display_width) // 2
start_y = (available_height - final_display_height) // 2
print(f"DEBUG: final ROTATED_rect=({crop_x},{crop_y},{crop_w},{crop_h}) rotated_size=({rot_w},{rot_h})")
# Convert screen coordinates to display frame coordinates
display_x = (x - start_x) / scale
display_y = (y - start_y) / scale
display_w = w / scale
display_h = h / scale
# Clamp to display frame bounds
display_x = max(0, min(display_x, display_width))
display_y = max(0, min(display_y, display_height))
display_w = min(display_w, display_width - display_x)
display_h = min(display_h, display_height - display_y)
# Now we need to convert from the display frame coordinates back to original frame coordinates
# The display frame is the result of: original -> crop -> rotation -> zoom
# Step 1: Reverse zoom
if self.zoom_factor != 1.0:
display_x = display_x / self.zoom_factor
display_y = display_y / self.zoom_factor
display_w = display_w / self.zoom_factor
display_h = display_h / self.zoom_factor
# Step 2: Reverse rotation
if self.rotation_angle != 0:
# Get the dimensions of the frame after crop but before rotation
if self.crop_rect:
crop_w, crop_h = int(self.crop_rect[2]), int(self.crop_rect[3])
else:
crop_w, crop_h = original_width, original_height
# Apply inverse rotation to coordinates
# The key insight: we need to use the dimensions of the ROTATED frame for the coordinate transformation
# because the coordinates we have are in the rotated coordinate system
if self.rotation_angle == 90:
# 90° clockwise rotation: (x,y) -> (y, rotated_width-x-w)
# The rotated frame has dimensions: height x width (swapped)
rotated_w, rotated_h = crop_h, crop_w
new_x = display_y
new_y = rotated_w - display_x - display_w
new_w = display_h
new_h = display_w
elif self.rotation_angle == 180:
# 180° rotation: (x,y) -> (width-x-w, height-y-h)
new_x = crop_w - display_x - display_w
new_y = crop_h - display_y - display_h
new_w = display_w
new_h = display_h
elif self.rotation_angle == 270:
# 270° clockwise rotation: (x,y) -> (rotated_height-y-h, x)
# The rotated frame has dimensions: height x width (swapped)
rotated_w, rotated_h = crop_h, crop_w
new_x = rotated_h - display_y - display_h
new_y = display_x
new_w = display_h
new_h = display_w
else:
new_x, new_y, new_w, new_h = display_x, display_y, display_w, display_h
display_x, display_y, display_w, display_h = new_x, new_y, new_w, new_h
# Step 3: Convert from cropped frame coordinates to original frame coordinates
original_x = display_x
original_y = display_y
original_w = display_w
original_h = display_h
# Add the crop offset to get back to original frame coordinates
if self.crop_rect:
crop_x, crop_y, crop_w, crop_h = self.crop_rect
original_x += crop_x
original_y += crop_y
# Clamp to original frame bounds
original_x = max(0, min(original_x, original_width))
original_y = max(0, min(original_y, original_height))
original_w = min(original_w, original_width - original_x)
original_h = min(original_h, original_height - original_y)
if original_w > 10 and original_h > 10: # Minimum size check
# Save current crop for undo
# Snap to full rotated frame if selection covers it
if crop_w >= int(0.9 * rot_w) and crop_h >= int(0.9 * rot_h):
if self.crop_rect:
self.crop_history.append(self.crop_rect)
self.crop_rect = (original_x, original_y, original_w, original_h)
self.crop_rect = None
self.clear_transformation_cache()
self.save_state() # Save state when crop is set
self.save_state()
print("DEBUG: selection ~full frame -> clearing crop (use full frame)")
return
if crop_w > 10 and crop_h > 10:
if self.crop_rect:
self.crop_history.append(self.crop_rect)
# Store crop in ROTATED frame coordinates
self.crop_rect = (crop_x, crop_y, crop_w, crop_h)
self.clear_transformation_cache()
self.save_state()
print(f"DEBUG: crop_rect (ROTATED space) set -> {self.crop_rect}")
# Disable motion tracking upon explicit crop set to avoid unintended offsets
if self.tracking_enabled:
self.tracking_enabled = False
print("DEBUG: tracking disabled due to manual crop set")
self.save_state()
else:
print("DEBUG: rejected small crop (<=10px)")
def seek_to_timeline_position(self, mouse_x, bar_x_start, bar_width):
"""Seek to position based on mouse click on timeline"""
@@ -2118,21 +2429,10 @@ class VideoEditor:
# Send progress update
self.render_progress_queue.put(("progress", "Calculating output dimensions...", 0.05, 0.0))
# Calculate output dimensions (accounting for rotation)
if self.crop_rect:
crop_width = int(self.crop_rect[2])
crop_height = int(self.crop_rect[3])
else:
crop_width = self.frame_width
crop_height = self.frame_height
# Swap dimensions if rotation is 90 or 270 degrees
if self.rotation_angle == 90 or self.rotation_angle == 270:
output_width = int(crop_height * self.zoom_factor)
output_height = int(crop_width * self.zoom_factor)
else:
output_width = int(crop_width * self.zoom_factor)
output_height = int(crop_height * self.zoom_factor)
# Calculate output dimensions to MATCH preview visible region
params = self._get_display_params()
output_width = max(2, params['visible_w'] - (params['visible_w'] % 2))
output_height = max(2, params['visible_h'] - (params['visible_h'] % 2))
# Ensure dimensions are divisible by 2 for H.264 encoding
output_width = output_width - (output_width % 2)
@@ -2142,9 +2442,10 @@ class VideoEditor:
self.render_progress_queue.put(("progress", "Setting up FFmpeg encoder...", 0.1, 0.0))
# Debug output dimensions
print(f"Output dimensions: {output_width}x{output_height}")
print(f"Output dimensions (match preview): {output_width}x{output_height}")
print(f"Zoom factor: {self.zoom_factor}")
print(f"Crop dimensions: {crop_width}x{crop_height}")
eff_x, eff_y, eff_w, eff_h = self._get_effective_crop_rect_for_frame(start_frame)
print(f"Effective crop (rotated): {eff_x},{eff_y} {eff_w}x{eff_h}")
# Skip the OpenCV codec fallbacks and go straight to FFmpeg
print("Using FFmpeg for encoding with OpenCV transformations...")
@@ -2291,32 +2592,48 @@ class VideoEditor:
return False
def _process_frame_for_render(self, frame, output_width: int, output_height: int):
def _process_frame_for_render(self, frame, output_width: int, output_height: int, frame_number: int = None):
"""Process a single frame for rendering (optimized for speed)"""
try:
# Apply crop (vectorized operation)
if self.crop_rect:
x, y, w, h = map(int, self.crop_rect)
# Apply rotation first to work in rotated space
if self.rotation_angle != 0:
frame = self.apply_rotation(frame)
# Clamp coordinates to frame bounds
h_frame, w_frame = frame.shape[:2]
x = max(0, min(x, w_frame - 1))
y = max(0, min(y, h_frame - 1))
w = min(w, w_frame - x)
h = min(h, h_frame - y)
# Apply EFFECTIVE crop regardless of whether a base crop exists, to enable follow and out-of-frame pad
x, y, w, h = self._get_effective_crop_rect_for_frame(frame_number or self.current_frame)
if w > 0 and h > 0:
frame = frame[y : y + h, x : x + w]
else:
return None
# Allow out-of-bounds by padding with black so center can remain when near edges
h_frame, w_frame = frame.shape[:2]
pad_left = max(0, -x)
pad_top = max(0, -y)
pad_right = max(0, (x + w) - w_frame)
pad_bottom = max(0, (y + h) - h_frame)
if any(p > 0 for p in (pad_left, pad_top, pad_right, pad_bottom)):
frame = cv2.copyMakeBorder(
frame,
pad_top,
pad_bottom,
pad_left,
pad_right,
borderType=cv2.BORDER_CONSTANT,
value=(0, 0, 0),
)
x = x + pad_left
y = y + pad_top
w_frame, h_frame = frame.shape[1], frame.shape[0]
# Clamp crop to padded frame
x = max(0, min(x, w_frame - 1))
y = max(0, min(y, h_frame - 1))
w = min(w, w_frame - x)
h = min(h, h_frame - y)
if w <= 0 or h <= 0:
return None
frame = frame[y : y + h, x : x + w]
# Apply brightness and contrast
frame = self.apply_brightness_contrast(frame)
# Apply rotation
if self.rotation_angle != 0:
frame = self.apply_rotation(frame)
# Apply zoom and resize directly to final output dimensions
if self.zoom_factor != 1.0:
height, width = frame.shape[:2]
@@ -2409,7 +2726,7 @@ class VideoEditor:
if not ret:
break
processed_frame = self._process_frame_for_render(frame, output_width, output_height)
processed_frame = self._process_frame_for_render(frame, output_width, output_height, start_frame + i)
if processed_frame is not None:
if i == 0:
print(f"Processed frame dimensions: {processed_frame.shape[1]}x{processed_frame.shape[0]}")
@@ -2500,6 +2817,11 @@ class VideoEditor:
print(" U: Undo crop")
print(" C: Clear crop")
print()
print("Motion Tracking:")
print(" Right-click: Add/remove tracking point (at current frame)")
print(" v: Toggle motion tracking on/off")
print(" V: Clear all tracking points")
print()
print("Other Controls:")
print(" Ctrl+Scroll: Zoom in/out")
print(" Shift+S: Save screenshot")
@@ -2539,6 +2861,8 @@ class VideoEditor:
print(" 1: Set cut start point")
print(" 2: Set cut end point")
print(" T: Toggle loop between markers")
print(" ,: Jump to previous marker")
print(" .: Jump to next marker")
if len(self.video_files) > 1:
print(" N: Next video")
print(" n: Previous video")
@@ -2652,6 +2976,14 @@ class VideoEditor:
if not self.is_image_mode:
if not self.auto_repeat_active:
self.start_auto_repeat_seek(1, False, True) # Ctrl+D: +60 frames
elif key == ord(","):
# Jump to previous marker (cut start or end)
if not self.is_image_mode:
self.jump_to_previous_marker()
elif key == ord("."):
# Jump to next marker (cut start or end)
if not self.is_image_mode:
self.jump_to_next_marker()
elif key == ord("-") or key == ord("_"):
self.rotate_clockwise()
print(f"Rotated to {self.rotation_angle}°")
@@ -2772,6 +3104,16 @@ class VideoEditor:
else:
print(f"DEBUG: File '{self.video_path.stem}' does not contain '_edited_'")
print("Enter key only overwrites files with '_edited_' in the name. Use 'n' to create new files.")
elif key == ord("v"):
# Toggle motion tracking on/off
self.tracking_enabled = not self.tracking_enabled
self.show_feedback_message(f"Motion tracking {'ON' if self.tracking_enabled else 'OFF'}")
self.save_state()
elif key == ord("V"):
# Clear all tracking points
self.tracking_points = {}
self.show_feedback_message("Tracking points cleared")
self.save_state()
elif key == ord("t"):
# Marker looping only for videos
if not self.is_image_mode:

croppa/spec.md Normal file (134 lines)

@@ -0,0 +1,134 @@
# Croppa - Feature Specification
## Overview
Croppa is a lightweight video and image editor that provides real-time editing capabilities with persistent state management.
## Notes
Note the distinction between lowercase and uppercase keys: uppercase keys imply Shift+key.

Every transformation (cropping, motion tracking points) is almost always applied to an already transformed frame, whether that transformation is rotation, cropping, zooming, or motion tracking itself. This means user input must be "de-transformed" before being applied to the frame. In other words, if we zoom into an area and right-click to add a tracking point, it must be added to that exact pixel ON THE ORIGINAL FRAME, NOT the zoomed-in / processed frame. Likewise with rotations.

All coordinates (crop region, zoom center, motion tracking points) are stored in reference to the original raw unprocessed frame. To display these points, they must be transformed to the processed (display) frame; likewise, user input received on the display frame must be transformed back to the original frame. A simple example: if the view is rotated by 90 degrees and the user clicks the top-left corner of the display frame, that coordinate must map to the bottom-left corner of the original frame.

The input to the editor is either a list of video files or a directory. If a directory is provided, the editor opens all editable files in that directory. When multiple files are open, the N and n keys navigate to the next and previous file; settings must be saved and loaded when navigating this way.
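The rotation part of this inverse mapping can be sketched as a small helper. This is an illustrative, hypothetical function (the editor's real mapping also composes crop and zoom inverses); it assumes clockwise rotation in 90-degree steps:

```python
def display_to_original(dx, dy, rotation_angle, orig_w, orig_h):
    """Map a click on the rotated display frame back to original-frame pixels.

    Illustrative sketch: ignores crop and zoom, handles only the
    90-degree-step clockwise rotation described above.
    """
    if rotation_angle == 90:
        # 90 deg CW: original (ox, oy) -> display (orig_h - 1 - oy, ox)
        return (dy, orig_h - 1 - dx)
    if rotation_angle == 180:
        return (orig_w - 1 - dx, orig_h - 1 - dy)
    if rotation_angle == 270:
        return (orig_w - 1 - dy, dx)
    return (dx, dy)
```

For a 640x480 frame rotated 90 degrees clockwise, a click at the display's top-left corner (0, 0) maps to (0, 479), the bottom-left of the original frame, matching the example above.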
## Core Features
### Video Playback
- **Space**: Play/pause video
- **a/d**: Seek backward/forward 1 frame
- **A/D**: Seek backward/forward 10 frames
- **Ctrl+a/d**: Seek backward/forward 60 frames
- **W/S**: Increase/decrease playback speed (0.1x to 10.0x, increments of 0.2)
- **Q/Y**: Increase/decrease seek multiplier (multiplies the frame count for a/d/A/D/Ctrl+a/d keys, 1.0x to 1000.0x, increments of 4.0)
- **q**: Quit the program
- **Timeline**: Click anywhere to jump to that position
- **Auto-repeat**: Hold seek keys for continuous seeking at 1 FPS rate
### Visual Transformations
- **-**: Rotate 90 degrees clockwise
- **e/E**: Increase/decrease brightness (-100 to +100, increments of 5)
- **r/R**: Increase/decrease contrast (0.1 to 3.0, increments of 0.1)
- **Ctrl+Scroll**: Zoom in/out (0.1x to 10.0x, increments of 0.1)
- **Ctrl+Click**: Set zoom center point
### Cropping
- **Shift+Click+Drag**: Select crop area with green rectangle preview
- **h/j/k/l**: Expand crop from right/down/up/left edges (15 pixels per keypress)
- **H/J/K/L**: Contract crop to left/down/up/right edges (15 pixels per keypress)
- **u**: Undo last crop
- **c**: Clear all cropping
### Motion Tracking
- **Right-click**: Add tracking point (blue circle with white border)
- **Right-click existing point**: Remove tracking point (within 50px)
- **v**: Toggle motion tracking on/off
- **V**: Clear all tracking points
- **Blue cross**: Shows computed tracking position
- **Automatic interpolation**: Tracks between keyframes
- **Crop follows**: Crop area centers on tracked object
- **Display**: Points are rendered as blue dots per frame; in addition, each frame shows the previous frame's points (red) and the next frame's points (green) at 50% alpha
#### Motion Tracking Navigation
- **,**: Jump to previous tracking marker (previous frame that has one or more tracking points). Wrap-around supported.
- **.**: Jump to next tracking marker (next frame that has one or more tracking points). Wrap-around supported.
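The wrap-around navigation above can be sketched as follows, assuming tracking points live in a `{frame: [points]}` dict as in the diff; the function names mirror the editor's methods but the bodies are an illustrative sketch:

```python
def jump_to_next_marker(current_frame, tracking_points):
    """Return the next frame that has tracking points, wrapping to the first."""
    frames = sorted(tracking_points.keys())
    if not frames:
        return current_frame  # nothing to jump to
    for f in frames:
        if f > current_frame:
            return f
    return frames[0]  # wrap around to the first marker

def jump_to_previous_marker(current_frame, tracking_points):
    """Return the previous frame that has tracking points, wrapping to the last."""
    frames = sorted(tracking_points.keys())
    if not frames:
        return current_frame
    for f in reversed(frames):
        if f < current_frame:
            return f
    return frames[-1]  # wrap around to the last marker
```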
### Markers and Looping
- **1**: Set cut start marker at current frame
- **2**: Set cut end marker at current frame
- **t**: Toggle loop playback between markers
- **Red lines**: Markers shown on timeline with numbers
- **Continuous loop**: Playback loops between markers when enabled
### File Management
- **Enter**: Render video (overwrites if filename contains "_edited_")
- **b**: Render video with new "_edited_001" filename (does NOT overwrite!)
- **s**: Save screenshot with auto-incrementing filename (video_frame_00001.jpg, video_frame_00002.jpg, etc. - NEVER overwrite existing screenshots)
- **N/n**: Next/previous video in directory
- **p**: Toggle project view (directory browser)
### Project View
- **wasd**: Navigate through video thumbnails
- **e**: Open selected video
- **Q/Y**: Change thumbnail size (fewer/more per row, size automatically computed to fit row)
- **q**: Quit
- **Progress bars**: Show editing progress for each video (blue bar showing current_frame/total_frames)
- **ESC**: Return to editor
### Display and Interface
- **f**: Toggle fullscreen
- **Status overlay**: Shows "Frame: 1500/3000 | Speed: 1.5x | Zoom: 2.0x | Seek: 5.0x | Rotation: 90° | Brightness: 10 | Contrast: 1.2 | Motion: ON (3 pts) | Playing/Paused"
- **Timeline**: Visual progress bar with current position handle
- **Feedback messages**: Temporary on-screen notifications (e.g. "Screenshot saved: video_frame_00001.jpg")
- **Progress bar**: Shows rendering progress with FPS counter (e.g. "Processing 1500/3000 frames | 25.3 FPS")
### State Management
- **Auto-save**: Settings saved automatically on changes and on quit
- **Per-video state**: Each video remembers its own settings
- **Cross-session**: Settings persist between application restarts
- **JSON files**: State stored as .json files next to videos with the same name as the video
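A minimal sketch of this persistence scheme (the state keys shown are illustrative, not the editor's exact schema):

```python
import json
from pathlib import Path

def state_path(video_path: Path) -> Path:
    # Same name as the video, .json extension, stored next to it
    # e.g. clip.mp4 -> clip.json
    return video_path.with_suffix(".json")

def save_state(video_path: Path, state: dict) -> None:
    state_path(video_path).write_text(json.dumps(state, indent=2))

def load_state(video_path: Path) -> dict:
    p = state_path(video_path)
    return json.loads(p.read_text()) if p.exists() else {}
```

Calling `save_state` on every change and on quit, and `load_state` when a video is opened (including via N/n navigation), gives the cross-session behavior described above.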
### Rendering
- **Background rendering**: Continue editing while rendering (rendering happens in separate thread, you can still seek/play/edit)
- **x**: Cancel active render
- **FFmpeg output**: Invoke FFmpeg process, pipe raw video frames via stdin, output MP4 with H.264 encoding (CRF 18, preset fast)
- **Progress tracking**: Real-time progress with FPS display
- **Overwrite protection**: Only overwrites files with "_edited_" in name
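The FFmpeg invocation can be sketched roughly as below. The CRF/preset values come from the spec; the `bgr24` input format is an assumption (OpenCV frames are BGR), and the helper names are illustrative:

```python
import subprocess

def ffmpeg_cmd(out_path: str, width: int, height: int, fps: float) -> list:
    """Build an FFmpeg command that reads raw frames on stdin -> H.264 MP4."""
    return [
        "ffmpeg", "-y",
        "-f", "rawvideo", "-pix_fmt", "bgr24",     # layout of the piped frames
        "-s", f"{width}x{height}", "-r", str(fps),
        "-i", "-",                                  # frames arrive on stdin
        "-c:v", "libx264", "-preset", "fast", "-crf", "18",
        "-pix_fmt", "yuv420p",                      # broad player compatibility
        out_path,
    ]

def start_render(out_path: str, width: int, height: int, fps: float):
    # The render thread writes frame.tobytes() to proc.stdin per frame,
    # then closes stdin and waits for FFmpeg to finish.
    return subprocess.Popen(ffmpeg_cmd(out_path, width, height, fps),
                            stdin=subprocess.PIPE)
```

Running the encoder in a separate process like this is what allows editing to continue while rendering: the UI thread only needs to hand off processed frames.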
### Image Mode
- **Same controls**: All editing features work on static images
- **No playback**: Space key disabled, no timeline
- **Screenshot mode**: Treats single images like video frames
### Error Handling
- **Format support**: MP4, AVI, MOV, MKV, WMV, FLV, WebM, M4V, JPG, PNG, BMP, TIFF, WebP
- **Backend fallback**: Tries multiple video backends automatically
- **Error messages**: Clear feedback for common issues
- **Graceful degradation**: Continues working when possible
### Performance Features
- **Frame caching**: Smooth seeking with cached frames (cache the decoded frames, LRU eviction, max 3000 frames)
- **Transformation caching**: Fast repeated operations (cache transformed frames during auto-repeat seeking)
- **Memory management**: Automatic cache cleanup
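The decoded-frame cache described above amounts to an LRU keyed by frame number; a minimal sketch (class and method names are illustrative):

```python
from collections import OrderedDict

class FrameCache:
    """LRU cache for decoded frames, capped at ~3000 entries as per the spec."""

    def __init__(self, max_frames: int = 3000):
        self.max_frames = max_frames
        self._frames = OrderedDict()  # frame_number -> decoded frame

    def get(self, frame_number):
        frame = self._frames.get(frame_number)
        if frame is not None:
            self._frames.move_to_end(frame_number)  # mark as recently used
        return frame

    def put(self, frame_number, frame):
        self._frames[frame_number] = frame
        self._frames.move_to_end(frame_number)
        while len(self._frames) > self.max_frames:
            self._frames.popitem(last=False)  # evict least recently used
```

`OrderedDict` keeps eviction O(1), which matters when seeking triggers thousands of cache operations per second.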
### Window Management
- **Resizable**: Window can be resized dynamically
- **Multi-window**: Project view opens in separate window
- **Focus handling**: Keys only affect active window
- **Context menu**: Right-click integration on Windows
This specification describes what Croppa does from the user's perspective - the features, controls, and behaviors that make up the application.

croppa/tracking.py Normal file (205 lines)

@@ -0,0 +1,205 @@
from typing import List, Dict, Tuple, Optional, NamedTuple
class TrackingPoint(NamedTuple):
"""Represents a tracking point with both original and display coordinates"""
original: Tuple[float, float] # Original frame coordinates (x, y)
display: Optional[Tuple[float, float]] = None # Display coordinates after transformation (x, y)
def __str__(self):
if self.display:
return f"TrackingPoint(orig={self.original}, display={self.display})"
return f"TrackingPoint(orig={self.original})"
class MotionTracker:
"""Handles motion tracking for crop and pan operations"""
def __init__(self):
self.tracking_points = {} # {frame_number: [TrackingPoint, ...]}
self.tracking_enabled = False
self.base_crop_rect = None # Original crop rect when tracking started
self.base_zoom_center = None # Original zoom center when tracking started
def add_tracking_point(self, frame_number: int, x: float, y: float):
"""Add a tracking point at the specified frame and coordinates
Args:
frame_number: The frame number to add the point to
x: Original x coordinate
y: Original y coordinate
"""
if frame_number not in self.tracking_points:
self.tracking_points[frame_number] = []
# Store only the original coordinates - display coordinates will be calculated fresh each time
point = TrackingPoint(original=(float(x), float(y)))
print(f"Adding tracking point: {point}")
self.tracking_points[frame_number].append(point)
def remove_tracking_point(self, frame_number: int, x: float, y: float, radius: int = 50):
"""Remove a tracking point by frame and proximity to x,y"""
if frame_number not in self.tracking_points:
return False
points = self.tracking_points[frame_number]
for i, point in enumerate(points):
px, py = point.original
# Calculate distance between points
distance = ((px - x) ** 2 + (py - y) ** 2) ** 0.5
if distance <= radius:
print(f"Removing tracking point: {point}")
del points[i]
if not points:
del self.tracking_points[frame_number]
return True
return False
def clear_tracking_points(self):
"""Clear all tracking points"""
self.tracking_points.clear()
def get_tracking_points_for_frame(self, frame_number: int) -> List[TrackingPoint]:
"""Get all tracking points for a specific frame"""
return self.tracking_points.get(frame_number, [])
def has_tracking_points(self) -> bool:
"""Check if any tracking points exist"""
return bool(self.tracking_points)
def get_interpolated_position(self, frame_number: int) -> Optional[Tuple[float, float]]:
"""Get interpolated position for a frame based on tracking points"""
if not self.tracking_points:
return None
# Get all frames with tracking points
frames = sorted(self.tracking_points.keys())
if not frames:
return None
# If we have a point at this exact frame, return it
if frame_number in self.tracking_points:
points = self.tracking_points[frame_number]
if points:
# Return average of all points at this frame
avg_x = sum(p.original[0] for p in points) / len(points)
avg_y = sum(p.original[1] for p in points) / len(points)
return (avg_x, avg_y)
# If frame is before first tracking point
if frame_number < frames[0]:
points = self.tracking_points[frames[0]]
if points:
avg_x = sum(p.original[0] for p in points) / len(points)
avg_y = sum(p.original[1] for p in points) / len(points)
return (avg_x, avg_y)
# If frame is after last tracking point
if frame_number > frames[-1]:
points = self.tracking_points[frames[-1]]
if points:
avg_x = sum(p.original[0] for p in points) / len(points)
avg_y = sum(p.original[1] for p in points) / len(points)
return (avg_x, avg_y)
# Find the two frames to interpolate between
for i in range(len(frames) - 1):
if frames[i] <= frame_number <= frames[i + 1]:
frame1, frame2 = frames[i], frames[i + 1]
points1 = self.tracking_points[frame1]
points2 = self.tracking_points[frame2]
if not points1 or not points2:
continue
# Get average positions for each frame
avg_x1 = sum(p.original[0] for p in points1) / len(points1)
avg_y1 = sum(p.original[1] for p in points1) / len(points1)
avg_x2 = sum(p.original[0] for p in points2) / len(points2)
avg_y2 = sum(p.original[1] for p in points2) / len(points2)
# Linear interpolation
t = (frame_number - frame1) / (frame2 - frame1)
interp_x = avg_x1 + t * (avg_x2 - avg_x1)
interp_y = avg_y1 + t * (avg_y2 - avg_y1)
return (interp_x, interp_y)
return None
def get_tracking_offset(self, frame_number: int) -> Tuple[float, float]:
"""Get the offset to center the crop on the tracked point"""
if not self.tracking_enabled:
print(f"get_tracking_offset: tracking not enabled, returning (0,0)")
return (0.0, 0.0)
if not self.base_zoom_center:
print(f"get_tracking_offset: no base_zoom_center, returning (0,0)")
return (0.0, 0.0)
current_pos = self.get_interpolated_position(frame_number)
if not current_pos:
print(f"get_tracking_offset: no interpolated position for frame {frame_number}, returning (0,0)")
return (0.0, 0.0)
# Calculate offset to center the crop on the tracked point
# The offset should move the display so the tracked point stays centered
offset_x = current_pos[0] - self.base_zoom_center[0]
offset_y = current_pos[1] - self.base_zoom_center[1]
print(f"get_tracking_offset: frame={frame_number}, base={self.base_zoom_center}, current={current_pos}, offset=({offset_x}, {offset_y})")
return (offset_x, offset_y)
def start_tracking(self, base_crop_rect: Tuple[int, int, int, int], base_zoom_center: Tuple[int, int]):
"""Start motion tracking with base positions"""
self.tracking_enabled = True
self.base_crop_rect = base_crop_rect
print(f"start_tracking: base_crop_rect={base_crop_rect}, base_zoom_center={base_zoom_center}")
# If no base_zoom_center is provided, use the center of the crop rect
if base_zoom_center is None and base_crop_rect is not None:
x, y, w, h = base_crop_rect
self.base_zoom_center = (x + w//2, y + h//2)
print(f"start_tracking: using crop center as base_zoom_center: {self.base_zoom_center}")
else:
self.base_zoom_center = base_zoom_center
print(f"start_tracking: using provided base_zoom_center: {self.base_zoom_center}")
def stop_tracking(self):
"""Stop motion tracking"""
self.tracking_enabled = False
self.base_crop_rect = None
self.base_zoom_center = None
def to_dict(self) -> Dict:
"""Convert to dictionary for serialization"""
# Convert TrackingPoint objects to tuples for serialization
serialized_points = {}
for frame_num, points in self.tracking_points.items():
# Store only the original coordinates for serialization
serialized_points[frame_num] = [p.original for p in points]
return {
'tracking_points': serialized_points,
'tracking_enabled': self.tracking_enabled,
'base_crop_rect': self.base_crop_rect,
'base_zoom_center': self.base_zoom_center
}
def from_dict(self, data: Dict):
"""Load from dictionary for deserialization"""
# Convert string keys back to integers for tracking_points
tracking_points_data = data.get('tracking_points', {})
self.tracking_points = {}
for frame_str, points in tracking_points_data.items():
frame_num = int(frame_str) # Convert string key to integer
# Convert tuples to TrackingPoint objects
self.tracking_points[frame_num] = [TrackingPoint(original=p) for p in points]
self.tracking_enabled = data.get('tracking_enabled', False)
self.base_crop_rect = data.get('base_crop_rect', None)
self.base_zoom_center = data.get('base_zoom_center', None)