x/exp/shiny/driver/internal: swizzle still needs to detect instruction on amd64 #12714

Closed
kardianos opened this issue Sep 22, 2015 · 5 comments


@kardianos
Contributor

CPU: AMD Phenom II X6 (Thuban PH-E0) 1055T
$ cpuid -1 | grep SSE4
SSE4.1 extensions = false
SSE4.2 extensions = false
SSE4A support = true

SIGILL: illegal instruction
PC=0x552cd5 m=6

goroutine 1 [running]:
golang.org/x/exp/shiny/driver/internal/swizzle.bgra16(0x7f982d522000, 0x40000, 0x40000)
/home/daniel/src/golang.org/x/exp/shiny/driver/internal/swizzle/swizzle_amd64.s:36 +0x45 fp=0xc8200517f0 sp=0xc8200517e8
golang.org/x/exp/shiny/driver/internal/swizzle.BGRA(0x7f982d522000, 0x40000, 0x40000)
/home/daniel/src/golang.org/x/exp/shiny/driver/internal/swizzle/swizzle_common.go:21 +0xcf fp=0xc820051850 sp=0xc8200517f0
golang.org/x/exp/shiny/driver/x11driver.(*bufferImpl).preUpload(0xc820106000)
/home/daniel/src/golang.org/x/exp/shiny/driver/x11driver/buffer.go:54 +0x11d fp=0xc820051898 sp=0xc820051850
...
rax 0xf0c0d0e0b08090a
rbx 0x0
rcx 0xc820000180
rdx 0x40000
rdi 0x7f982d562000
rsi 0x7f982d522000
rbp 0x40000
rsp 0xc8200517e8
r8 0x7f982d522000
r9 0x676050
r10 0x2
r11 0x246
r12 0x5
r13 0x6c24cc
r14 0x8
r15 0x0
rip 0x552cd5
rflags 0x10287
cs 0x33
fs 0x0
gs 0x0
exit status 2

@rakyll rakyll added this to the Unreleased milestone Sep 23, 2015
@rakyll
Contributor

rakyll commented Sep 23, 2015

/cc @nigeltao

@rakyll rakyll changed the title from "golang.org/x/exp/shiny/driver/internal: swizzle still needs to detect instruction on amd64" to "x/exp/shiny/driver/internal: swizzle still needs to detect instruction on amd64" Sep 23, 2015
@nigeltao
Contributor

What does "cpuid -1" without the grep say? I think that PSHUFB was introduced in SSSE3.
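For reference, SSSE3 support is reported in bit 9 of ECX after executing CPUID with EAX=1, separately from SSE3 (bit 0) and SSE4.1/SSE4.2 (bits 19 and 20). The following is only a minimal sketch of decoding those bits: the `hasFeature` helper, the constant names, and the sample ECX value are made up for illustration, and the sketch does not execute CPUID itself.

```go
package main

import "fmt"

// Feature bits in ECX after executing CPUID with EAX=1.
const (
	ecxSSE3   uint32 = 1 << 0
	ecxSSSE3  uint32 = 1 << 9
	ecxSSE41  uint32 = 1 << 19
	ecxSSE42  uint32 = 1 << 20
	ecxPOPCNT uint32 = 1 << 23
)

// hasFeature reports whether every bit in mask is set in ecx.
// Hypothetical helper, not part of any real package.
func hasFeature(ecx, mask uint32) bool {
	return ecx&mask == mask
}

func main() {
	// Illustrative value only, not a raw register dump:
	// a CPU with SSE3 and POPCNT but without SSSE3.
	ecx := ecxSSE3 | ecxPOPCNT

	fmt.Println("SSE3: ", hasFeature(ecx, ecxSSE3))
	fmt.Println("SSSE3:", hasFeature(ecx, ecxSSSE3))
	fmt.Println("SSE4.1:", hasFeature(ecx, ecxSSE41))
}
```

On such a value, a check that only looks at SSE3 would wrongly allow PSHUFB, which is why the SSSE3 bit needs to be tested explicitly.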

@kardianos
Contributor Author

  SSE extensions                         = true
  SSE2 extensions                        = true
  PNI/SSE3: Prescott New Instructions     = true
  SSSE3 extensions                        = false
  SSE4.1 extensions                       = false
  SSE4.2 extensions                       = false
  SSE extensions                        = true
  SSE4A support                          = true
  misaligned SSE mode                    = true
  SSSE3/SSE5 opcode set disable = false
  128-bit SSE executed full-width = true

...

CPU:
vendor_id = "AuthenticAMD"
version information (1/eax):
processor type = primary processor (0)
family = Intel Pentium 4/Pentium D/Pentium Extreme Edition/Celeron/Xeon/Xeon MP/Itanium2, AMD Athlon 64/Athlon XP-M/Opteron/Sempron/Turion (15)
model = 0xa (10)
stepping id = 0x0 (0)
extended family = 0x1 (1)
extended model = 0x0 (0)
(simple synth) = AMD Phenom II X4 / X6 (Zosma / Thuban PH-E0), 45nm
miscellaneous (1/ebx):
process local APIC physical ID = 0x0 (0)
cpu count = 0x6 (6)
CLFLUSH line size = 0x8 (8)
brand index = 0x0 (0)
brand id = 0x00 (0): unknown
feature information (1/edx):
x87 FPU on chip = true
virtual-8086 mode enhancement = true
debugging extensions = true
page size extensions = true
time stamp counter = true
RDMSR and WRMSR support = true
physical address extensions = true
machine check exception = true
CMPXCHG8B inst. = true
APIC on chip = true
SYSENTER and SYSEXIT = true
memory type range registers = true
PTE global bit = true
machine check architecture = true
conditional move/compare instruction = true
page attribute table = true
page size extension = true
processor serial number = false
CLFLUSH instruction = true
debug store = false
thermal monitor and clock ctrl = false
MMX Technology = true
FXSAVE/FXRSTOR = true
SSE extensions = true
SSE2 extensions = true
self snoop = false
hyper-threading / multi-core supported = true
therm. monitor = false
IA64 = false
pending break event = false
feature information (1/ecx):
PNI/SSE3: Prescott New Instructions = true
PCLMULDQ instruction = false
64-bit debug store = false
MONITOR/MWAIT = true
CPL-qualified debug store = false
VMX: virtual machine extensions = false
SMX: safer mode extensions = false
Enhanced Intel SpeedStep Technology = false
thermal monitor 2 = false
SSSE3 extensions = false
context ID: adaptive or shared L1 data = false
FMA instruction = false
CMPXCHG16B instruction = true
xTPR disable = false
perfmon and debug = false
process context identifiers = false
direct cache access = false
SSE4.1 extensions = false
SSE4.2 extensions = false
extended xAPIC support = false
MOVBE instruction = false
POPCNT instruction = true
time stamp counter deadline = false
AES instruction = false
XSAVE/XSTOR states = false
OS-enabled XSAVE/XSTOR = false
AVX: advanced vector extensions = false
F16C half-precision convert instruction = false
RDRAND instruction = false
hypervisor guest status = false
cache and TLB information (2):
processor serial number: 0010-0FA0-0000-0000-0000-0000
MONITOR/MWAIT (5):
smallest monitor-line size (bytes) = 0x40 (64)
largest monitor-line size (bytes) = 0x40 (64)
enum of Monitor-MWAIT exts supported = true
supports intrs as break-event for MWAIT = true
number of C0 sub C-states using MWAIT = 0x0 (0)
number of C1 sub C-states using MWAIT = 0x0 (0)
number of C2 sub C-states using MWAIT = 0x0 (0)
number of C3 sub C-states using MWAIT = 0x0 (0)
number of C4 sub C-states using MWAIT = 0x0 (0)
number of C5 sub C-states using MWAIT = 0x0 (0)
number of C6 sub C-states using MWAIT = 0x0 (0)
number of C7 sub C-states using MWAIT = 0x0 (0)
Thermal and Power Management Features (6):
digital thermometer = false
Intel Turbo Boost Technology = false
ARAT always running APIC timer = false
PLN power limit notification = false
ECMD extended clock modulation duty = false
PTM package thermal management = false
digital thermometer thresholds = 0x0 (0)
ACNT/MCNT supported performance measure = true
ACNT2 available = false
performance-energy bias capability = false
extended processor signature (0x80000001/eax):
family/generation = AMD Athlon 64/Opteron/Sempron/Turion (15)
model = 0xa (10)
stepping id = 0x0 (0)
extended family = 0x1 (1)
extended model = 0x0 (0)
(simple synth) = AMD Phenom II X4 / X6 (Zosma / Thuban PH-E0), 45nm
extended feature flags (0x80000001/edx):
x87 FPU on chip = true
virtual-8086 mode enhancement = true
debugging extensions = true
page size extensions = true
time stamp counter = true
RDMSR and WRMSR support = true
physical address extensions = true
machine check exception = true
CMPXCHG8B inst. = true
APIC on chip = true
SYSCALL and SYSRET instructions = true
memory type range registers = true
global paging extension = true
machine check architecture = true
conditional move/compare instruction = true
page attribute table = true
page size extension = true
multiprocessing capable = false
no-execute page protection = true
AMD multimedia instruction extensions = true
MMX Technology = true
FXSAVE/FXRSTOR = true
SSE extensions = true
1-GB large page support = true
RDTSCP = true
long mode (AA-64) = true
3DNow! instruction extensions = true
3DNow! instructions = true
extended brand id (0x80000001/ebx):
raw = 0x10000050 (268435536)
BrandId = 0x50 (80)
str1 = 0x0 (0)
str2 = 0x0 (0)
PartialModel = 0x5 (5)
PG = 0x0 (0)
PkgType = 0x1 (1)
AMD feature flags (0x80000001/ecx):
LAHF/SAHF supported in 64-bit mode = true
CMP Legacy = true
SVM: secure virtual machine = true
extended APIC space = true
AltMovCr8 = true
LZCNT advanced bit manipulation = true
SSE4A support = true
misaligned SSE mode = true
3DNow! PREFETCH/PREFETCHW instructions = true
OS visible workaround = true
instruction based sampling = true
XOP support = false
SKINIT/STGI support = true
watchdog timer support = true
lightweight profiling support = false
4-operand FMA instruction = false
NodeId MSR C001100C = false
TBM support = false
topology extensions = false
brand = "AMD Phenom(tm) II X6 1055T Processor"
L1 TLB/cache information: 2M/4M pages & L1 TLB (0x80000005/eax):
instruction # entries = 0x10 (16)
instruction associativity = 0xff (255)
data # entries = 0x30 (48)
data associativity = 0xff (255)
L1 TLB/cache information: 4K pages & L1 TLB (0x80000005/ebx):
instruction # entries = 0x20 (32)
instruction associativity = 0xff (255)
data # entries = 0x30 (48)
data associativity = 0xff (255)
L1 data cache information (0x80000005/ecx):
line size (bytes) = 0x40 (64)
lines per tag = 0x1 (1)
associativity = 0x2 (2)
size (Kb) = 0x40 (64)
L1 instruction cache information (0x80000005/edx):
line size (bytes) = 0x40 (64)
lines per tag = 0x1 (1)
associativity = 0x2 (2)
size (Kb) = 0x40 (64)
L2 TLB/cache information: 2M/4M pages & L2 TLB (0x80000006/eax):
instruction # entries = 0x0 (0)
instruction associativity = L2 off (0)
data # entries = 0x80 (128)
data associativity = 2-way (2)
L2 TLB/cache information: 4K pages & L2 TLB (0x80000006/ebx):
instruction # entries = 0x200 (512)
instruction associativity = 4-way (4)
data # entries = 0x200 (512)
data associativity = 4-way (4)
L2 unified cache information (0x80000006/ecx):
line size (bytes) = 0x40 (64)
lines per tag = 0x1 (1)
associativity = 16-way (8)
size (Kb) = 0x200 (512)
L3 cache information (0x80000006/edx):
line size (bytes) = 0x40 (64)
lines per tag = 0x1 (1)
associativity = 48-way (11)
size (in 512Kb units) = 0xc (12)
Advanced Power Management Features (0x80000007/edx):
temperature sensing diode = true
frequency ID (FID) control = false
voltage ID (VID) control = false
thermal trip (TTP) = true
thermal monitor (TM) = true
software thermal control (STC) = true
100 MHz multiplier control = true
hardware P-State control = true
TscInvariant = true
Physical Address and Linear Address Size (0x80000008/eax):
maximum physical address bits = 0x30 (48)
maximum linear (virtual) address bits = 0x30 (48)
maximum guest physical address bits = 0x0 (0)
Logical CPU cores (0x80000008/ecx):
number of CPU cores - 1 = 0x5 (5)
ApicIdCoreIdSize = 0x3 (3)
SVM Secure Virtual Machine (0x8000000a/eax):
SvmRev: SVM revision = 0x1 (1)
SVM Secure Virtual Machine (0x8000000a/edx):
nested paging = true
LBR virtualization = true
SVM lock = true
NRIP save = true
MSR based TSC rate control = false
VMCB clean bits support = false
flush by ASID = false
decode assists = false
SSSE3/SSE5 opcode set disable = false
pause intercept filter = true
pause filter threshold = false
NASID: number of address space identifiers = 0x40 (64):
L1 TLB information: 1G pages (0x80000019/eax):
instruction # entries = 0x0 (0)
instruction associativity = L2 off (0)
data # entries = 0x30 (48)
data associativity = full (15)
L2 TLB information: 1G pages (0x80000019/ebx):
instruction # entries = 0x0 (0)
instruction associativity = L2 off (0)
data # entries = 0x10 (16)
data associativity = 8-way (6)
SVM Secure Virtual Machine (0x8000001a/eax):
128-bit SSE executed full-width = true
MOVU* better than MOVL_/MOVH_ = true
Instruction Based Sampling Identifiers (0x8000001b/eax):
IBS feature flags valid = true
IBS fetch sampling = true
IBS execution sampling = true
read write of op counter = true
op counting mode = true
branch target address reporting = false
IbsOpCurCnt and IbsOpMaxCnt extend 7 = false
invalid RIP indication supported = false
(instruction supported synth):
CMPXCHG8B = true
conditional move/compare = true
PREFETCH/PREFETCHW = true
(multi-processing synth): multi-core (c=6)
(multi-processing method): AMD
(APIC widths synth): CORE_width=3 SMT_width=0
(APIC synth): PKG_ID=0 CORE_ID=0 SMT_ID=0
(synth) = AMD Phenom II X6 (Thuban PH-E0), 45nm 1055T Processor

@nigeltao
Contributor

I believe that this is fixed, but I obviously didn't try it on your CPU. Please re-open if you're still seeing problems.
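As a rough illustration of the general approach (not necessarily the code that was committed), a swizzle entry point can fall back to a portable loop whenever no accelerated routine has been enabled for the current CPU. In this sketch, `bgraFast` and `bgraGeneric` are hypothetical names, and `bgraFast` stands in for an assembly routine that would only be installed after a successful SSSE3 check.

```go
package main

import "fmt"

// bgraFast is a hypothetical hook for an accelerated (e.g. PSHUFB-based)
// implementation. It stays nil unless the CPU is known to support the
// instructions that implementation needs.
var bgraFast func(p []byte)

// bgraGeneric swaps the first and third byte of every 4-byte pixel
// in place, using only portable Go.
func bgraGeneric(p []byte) {
	for i := 0; i+3 < len(p); i += 4 {
		p[i+0], p[i+2] = p[i+2], p[i+0]
	}
}

// BGRA converts a pixel buffer between RGBA and BGRA order in place,
// using the accelerated routine only when one has been installed.
func BGRA(p []byte) {
	if len(p)%4 != 0 {
		panic("swizzle: len(p) is not a multiple of 4")
	}
	if bgraFast != nil {
		bgraFast(p)
		return
	}
	bgraGeneric(p)
}

func main() {
	p := []byte{1, 2, 3, 4, 5, 6, 7, 8}
	BGRA(p)
	fmt.Println(p) // [3 2 1 4 7 6 5 8]
}
```

With this shape, a CPU without SSSE3 never reaches the unsupported instruction, at the cost of a slower per-pixel loop.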

@kardianos
Contributor Author

I tried it. That did fix it on my CPU. Thanks!

On Wed, Sep 23, 2015 at 9:06 PM Nigel Tao notifications@github.com wrote:

> I believe that this is fixed, but I obviously didn't try it on your CPU.
> Please re-open if you're still seeing problems.

@golang golang locked and limited conversation to collaborators Sep 24, 2016